Drupal Performance

Dave Kinchlea

Given that GATE Village is built on Drupal, it seems appropriate that I should speak to Drupal performance in a general sense -- its ability to perform was actually one of its selling points for me. One of the most significant performance impacts on Drupal is, of course, at the DB. While there is a lot of logic in the code, all of the data lives in the DB, and thus DB performance and DB bloat are always front and centre in a Drupal installation.

I'll digress for a moment to suggest that Drupal's choice to rely so heavily on a database to contain content has both positive and negative consequences. On the plus side, when the DB server has enough RAM to hold the entire database (or at least all indexes), there is almost no faster or better approach; it can be very flexible and is certainly able to perform. However, it degrades very badly when resources are scarce, and it doesn't present any nice solutions when trouble does rear its ugly head -- usually some choice between bigger/more resources, dumping (archiving) content, and re-writing code to accommodate. None of these is a good option, and they are the inherent peril of a true CMS like Drupal. Open Text's Livelink has an advantage in that it is built around the management of discrete and external objects -- files, documents, even paper -- and thus the vast majority of bytes of data are not in the DB but stored on file systems. A typical deployment would see the DB at around 1/10th the size of the file system holding the content. It wasn't always the case for them, however: they started by putting all content into the DB, as seems natural, but they have been in business since the days when RAM was a very expensive resource to be rationed wherever possible. That's not so true anymore, but it is somewhat beside the point here.

While Drupal clearly has the ability to host files within (virtual) folders, that is the exception rather than the rule for content. Most content is made up of records in one or more tables within a DB; GATE Village, for instance, has some content that requires over a dozen tables. This allows for (and is a consequence of) the reuse of content types -- specifically using the Content Construction Kit (CCK). It has some fairly serious issues when it comes to compliance, corporate governance, and other data integrity concerns, but it can be a win from a performance standpoint in that individual records in a table stay relatively small.

Hooks and Callbacks

As even the noob Drupal programmer knows, Drupal's method of hooks makes it VERY easy to modify the behaviour of virtually any bit of code in the system -- that is both the power and the Achilles' heel of Drupal. Performing most transactions requires reading and sorting through hundreds of potential functions (looking for functions that match a particular naming pattern). Thus, to some extent at least, the more functionality you build into your Drupal site, the worse the performance. An exceptionally functional site like GATE Village, which looks to provide nearly every bell and whistle any user could want, very quickly finds itself a white elephant of a web site ... with great functionality that nobody will use because it performs so slowly. As we built out GATE Village we hit average transaction times in the 6 second range at one point, and I can tell you with certainty that no web-based service can survive with that sort of poor performance unless you really have a niche market (that is, you are solving a difficult business problem).
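The naming-pattern lookup described above can be sketched roughly as follows. This is an illustration in Python, not Drupal's actual PHP code (which also caches parts of this lookup); the module names, the `node_view` hook name, and the helper functions are all hypothetical. It only shows why the number of name checks per hook grows with the number of enabled modules, even when very few modules implement the hook.

```python
# Illustrative sketch only: Drupal itself is written in PHP. Every module,
# hook, and helper name here is a hypothetical stand-in.

def invoke_all(hook, modules, registry):
    """Call every function named <module>_<hook> and collect the results.

    Returns (results, lookups) so the cost of the name search is visible.
    """
    results = []
    lookups = 0
    for module in modules:
        lookups += 1  # one name check per enabled module
        func = registry.get(f"{module}_{hook}")
        if func is not None:
            results.append(func())
    return results, lookups


def make_registry(modules, implementers, hook):
    """Build a fake function table in which only `implementers` define the hook."""
    return {
        f"{m}_{hook}": (lambda m=m: f"{m} handled {hook}")
        for m in modules
        if m in implementers
    }


# Module counts taken from the post: 218 enabled on the smaller site,
# 350 on the larger one. Only two modules implement the hook on either site.
small_site = [f"mod{i}" for i in range(218)]
large_site = [f"mod{i}" for i in range(350)]
implementers = {"mod3", "mod17"}

small_results, small_lookups = invoke_all(
    "node_view", small_site, make_registry(small_site, implementers, "node_view"))
large_results, large_lookups = invoke_all(
    "node_view", large_site, make_registry(large_site, implementers, "node_view"))

print(small_lookups, large_lookups)    # 218 350
print(small_results == large_results)  # True
```

Both sites do the same useful work, but the larger one performs more name checks for every hook it fires.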

The difficulty is that most modules are built using other modules ... which is exactly what we want to see, but many modules also come with their own sub-modules, and before you realize it you have hundreds of modules. GATE Village currently has over 350 modules enabled (including all core modules) out of about 500 available, while the smallest Trusted Neighbourhood (so far) has 218 enabled, and there IS a noticeable difference in performance between GATEVillage.net and Rantcomputing.com! Frankly, GATEVillage as a community itself is not viable with the current performance (though I have certainly been forced to use worse over the years), but Rantcomputing.com is quite snappy and satisfying. Yet they share the same DB, code, and content, just not the same functionality.

Anonymous Viewers are NOT Patient

GATE Village was built as a platform to do business, both as an Intranet and an Extranet -- that is, we have the functionality required to conduct your business and manage your business. However, when it comes to visitors to your site, functionality is of little interest; what is important is that they can find what they are looking for (or what you are trying to get them to find), and you had best deliver the goods at speed. In other words, while authenticated users taking advantage of the functionality have some inherent willingness to compromise on performance, visitors have no patience at all.

Now the saving grace here is that visitors don't really know what is current content, unless you decide to tell them. A page of content that is created, cached, and served directly for (say) one week is still "new" to a first-time visitor. This means that a site that hopes to attract visitors needs to create strategies that allow for speed, and there are really only two possible approaches:

  1. Do a minimal amount of work (few graphics and dynamic functions); or
  2. Do the work you need to do before you need to do it (Cache).
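Approach 2 can be sketched as a small page cache that renders pages ahead of time and serves the stored copy until it expires. This is a minimal Python illustration, not how Drupal or any particular module implements caching; the `PageCache` class, its `warm` method, and the injected `clock` are hypothetical names chosen for the example, and the one-week lifetime simply echoes the figure used above.

```python
import time

ONE_WEEK = 7 * 24 * 60 * 60  # seconds; the one-week lifetime mentioned above


class PageCache:
    """Hypothetical cache: render pages early, then serve the stored copy."""

    def __init__(self, render, ttl=ONE_WEEK, clock=time.time):
        self.render = render  # the expensive page builder
        self.ttl = ttl
        self.clock = clock    # injectable so expiry can be tested
        self.store = {}       # path -> (expires_at, html)
        self.renders = 0      # how many times the expensive work ran

    def warm(self, paths):
        """Do the work before any visitor asks for it."""
        for path in paths:
            self._refresh(path)

    def get(self, path):
        """Serve the cached copy; re-render only if it is missing or expired."""
        entry = self.store.get(path)
        if entry is None or entry[0] <= self.clock():
            return self._refresh(path)
        return entry[1]

    def _refresh(self, path):
        html = self.render(path)
        self.renders += 1
        self.store[path] = (self.clock() + self.ttl, html)
        return html


# Usage with a fake clock so that a week can pass instantly.
now = [0.0]
cache = PageCache(render=lambda p: f"<html>{p}</html>", clock=lambda: now[0])
cache.warm(["/about"])

first = cache.get("/about")           # served from the warmed cache
now[0] += ONE_WEEK - 1
second = cache.get("/about")          # still inside the one-week window
renders_before_expiry = cache.renders

now[0] += 2                           # step past the expiry time
third = cache.get("/about")           # triggers exactly one re-render
renders_after_expiry = cache.renders
```

The visitor-facing call (`get`) almost never pays the rendering cost; it is paid once at warm-up and once per expiry.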

Most sites these days seem to have chosen 1 -- some for obvious reasons (a cached facebook or twitter page isn't what one expects) and some because it makes good business sense to do so if you can get away with it. That is, if your visitors are drawn to your site by some service you provide, then they are quite willing to put up with drab; but most businesses have to work a bit harder, with fancy graphics, embedded multimedia, and other tantalizing features that look great but take a fair amount of processing power to create.

GATE Village has provided the ability to do either or both. With the ability to choose a small subset of modules and make a lean, fast web site, those communities that need instant updates of major content can be accommodated. With the use of AJAX and JSON it is possible to update only small portions of a web page (for instance a shopping cart, a bid, or a new comment). And using Boost (http://drupal.org/project/Boost) we also provide static HTML pages of all public content, providing the ultimate in performance (these pages can themselves use AJAX, so you can do this and still run an online store).
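The static-page idea can be sketched as "check for a pre-built HTML file first, and only fall back to the full dynamic render when there isn't one". The Python below is a hedged illustration of that general technique, not Boost's actual implementation or configuration; `static_path`, `serve`, and the file layout are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def static_path(cache_dir, url_path):
    """Map a URL path such as '/about' to <cache_dir>/about.html (assumed layout).

    Sketch only: a real server must reject paths containing '..'.
    """
    name = url_path.strip("/") or "index"
    return Path(cache_dir) / f"{name}.html"


def serve(url_path, logged_in, cache_dir, render):
    """Return (source, html), where source is 'static' or 'dynamic'."""
    if not logged_in:
        cached = static_path(cache_dir, url_path)
        if cached.is_file():
            # Anonymous visitor and a static copy exists: no rendering at all.
            return "static", cached.read_text(encoding="utf-8")

    html = render(url_path)  # the slow, full page build
    if not logged_in:
        # Save the public version so the next visitor gets the static file.
        target = static_path(cache_dir, url_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(html, encoding="utf-8")
    return "dynamic", html


with tempfile.TemporaryDirectory() as cache_dir:
    render = lambda p: f"<html>page for {p}</html>"
    first = serve("/store/item-1", False, cache_dir, render)   # built, then saved
    second = serve("/store/item-1", False, cache_dir, render)  # read from disk
    member = serve("/store/item-1", True, cache_dir, render)   # always dynamic
```

In this sketch logged-in users always get the dynamic page, while anonymous visitors hit the static file after the first request. Small dynamic parts such as a shopping cart would then be filled in separately over AJAX.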
