Archiving versus Management ROI; and the winner is....

Dave Kinchlea

One of the big problems with "Archiving" of data that software vendors must understand is that it is handled by IT, and so costs and Return On Investment are focused on infrastructure, such as Tier 1 versus Tier 2 storage. Sometimes that is a sufficient metric to justify archiving, but often it tells only part of the story, and the least interesting part at that. In this note I'll examine some straw man archiving and management scenarios and see how the numbers pan out.

In the archiving world, "bytes are king". By that I mean that the measurement used by most vendors of archiving solutions, and the thinking of most of the users of those solutions, revolves around the size of the files, documents, and other objects being archived. The act of archiving, and its consequence for the application being archived from, is thought of as the removal of a significant volume of data. For file systems that means a reduction in the total number of used bytes; for a database application like MS Exchange or SAP it means a reduction in used tablespace. The point is that the measurement is the number of bytes being removed from one application and put into another. But the Open Text Content Server (OTCS) doesn't really think in terms of bytes for most things it does; rather, it thinks in terms of objects. Depending on the content, this can have positive and negative consequences.

When we look at file systems, Livelink's view of the world is a good thing: the typical candidates for archiving are large, old files, the point basically being to get rid of the outliers (the 20:80 rule, where 20% of the content consumes 80% of a resource). This also works for SAP and other DB-centric archiving, because the archived objects are usually stored as a single unit. However, this is most definitely not true for email.

Let's assume:

  • there are two storage services,
    • Tier 1 has 1TB of highly available SAN/DAS/iSCSI at (say) $25/GB and
    • Tier 2 has 10TB of SATA-based DAS/NAS/iSCSI at $10/GB
  • the metadata for a file in OTCS is 2.5 KB
  • the metadata for an email in OTCS is 750 bytes
  • the average email size is 20 KB (approximately 100,000 emails per 2GB PST file)
  • the full-text index is 25% of the size of the original content
  • RAM costs $90/GB

With those assumptions in mind, the potential savings for archiving a 2GB file from a file system into OTCS is (2*25) - (2*10 + (.25*2*25) + (2*0.00025*90)) = $17.45 with a full-text index, or about a 35% savings, and about $30 without a full-text index, or about a 60% savings. There are many other costs not listed here for simplicity, but in both cases that is a pretty compelling reason to at least look at the solution.
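The file-system arithmetic above can be sketched in a few lines of Python. This is only a sketch: the constant and function names are made up for illustration, and reading the 25 and 90 multipliers as the Tier 1 and RAM rates is an interpretation of the expression, not something the text states outright. The metadata term is taken exactly as written (2*0.00025).

```python
# Prices from the list of assumptions (USD per GB).
TIER1_PER_GB = 25.0
TIER2_PER_GB = 10.0
RAM_PER_GB = 90.0
INDEX_RATIO = 0.25  # full-text index size as a fraction of the content


def file_archive_savings(size_gb, metadata_gb, with_index=True):
    """Return (savings in USD, savings as a fraction of the starting cost)."""
    before = size_gb * TIER1_PER_GB          # content sits on Tier 1 today
    after = size_gb * TIER2_PER_GB           # content moved to Tier 2
    if with_index:
        # Assumption: the index is priced at the Tier 1 rate.
        after += INDEX_RATIO * size_gb * TIER1_PER_GB
    # Assumption: metadata is priced at the RAM rate.
    after += metadata_gb * RAM_PER_GB
    savings = before - after
    return savings, savings / before


# Metadata term exactly as it appears in the expression: 2*0.00025.
with_idx, with_idx_pct = file_archive_savings(2, 2 * 0.00025, with_index=True)
no_idx, no_idx_pct = file_archive_savings(2, 2 * 0.00025, with_index=False)
print(f"with index:    ${with_idx:.3f} ({with_idx_pct:.0%})")
print(f"without index: ${no_idx:.3f} ({no_idx_pct:.0%})")
```

Changing `size_gb` or the price constants makes it easy to see how sensitive the savings are to the Tier 1 / Tier 2 price gap.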

Now let's do the same for 2GB of email within an Exchange server: (2*90) - (2*10 + (.25*2*25) + (200,000*0.000075*90)) = -$1,202.50, or about -668% (to be clear, a major INCREASE in cost). Two caveats apply. First, there are far more costs involved in an Exchange server than just RAM, and one of the main expected benefits of archiving email is to ensure that a new Exchange server is not required. Second, nobody actually does email archiving in this way; much more typically the archiving would be restricted to those emails with attachments of 1MB or more. That changes the numbers dramatically, to (2*90) - (2*10 + (.25*2*25) + (2,000*0.000075*90)) = $134, a net savings of about 74%. Once again, a compelling story worthy of more investigation.
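The two email scenarios differ only in how many objects end up in the archive, which a small Python sketch makes visible. The names here are hypothetical, and the email counts (200,000 and 2,000) and the 0.000075 per-email factor are copied from the expressions as written rather than re-derived from the assumptions list.

```python
# Prices from the list of assumptions (USD per GB).
TIER1_PER_GB = 25.0
TIER2_PER_GB = 10.0
RAM_PER_GB = 90.0
INDEX_RATIO = 0.25
# Per-email metadata factor, exactly as written in the expressions.
PER_EMAIL_METADATA_FACTOR = 0.000075


def email_archive_savings(size_gb, archived_emails):
    """Return (savings in USD, savings as a fraction of the starting cost)."""
    before = size_gb * RAM_PER_GB  # the text prices Exchange data at the RAM rate
    after = (
        size_gb * TIER2_PER_GB                       # content on Tier 2
        + INDEX_RATIO * size_gb * TIER1_PER_GB       # full-text index
        + archived_emails * PER_EMAIL_METADATA_FACTOR * RAM_PER_GB  # metadata
    )
    savings = before - after
    return savings, savings / before


everything, everything_pct = email_archive_savings(2, 200_000)
large_only, large_only_pct = email_archive_savings(2, 2_000)
print(f"archive every email:   ${everything:,.2f} ({everything_pct:.0%})")
print(f"large attachments only: ${large_only:,.2f} ({large_only_pct:.0%})")
```

The byte count is identical in both calls; only the object count changes, which is the "objects, not bytes" point in numeric form.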


Now let's look at a management task, PST migration. Here the goal is the elimination of PST files, so we do not have the luxury of picking and choosing what is to be archived, and we will again use the 20k/100k figure. The starting cost is now even lower, as a PST file is most likely stored on a local hard drive, not on Tier 1 storage, let alone in RAM. In fact the net storage savings is zero (likely even less than that), so all we are interested in is the cost of categorizing and indexing the emails: (.25*2*25) + (200,000*0.000075*90) = $1,362.50 of added cost, or about 6812.5% more than the original cost (which implies a starting cost of $20 for the 2GB file).

There are certainly no savings to be had there, but then storage costs are NOT the reason for eliminating PST files; that is a risk mitigation move. The cost you must weigh here is the potential risk of a Legal Discovery request requiring essentially the same activity as the migration. Having the data at the ready means that such Discovery requests are far less expensive to perform and, more significantly, the amount of data matching a request is usually far smaller (because searches can more accurately narrow the scope). Keep in mind that a legal team will have to review all the data returned from a legal discovery request, and the cost of that time could be as high as $600/hour. So one does NOT want to provide PST files to legal teams!

In any case, it is clear that the savings will not be found at the IT infrastructure level; to the contrary, PST migration could well be a costly endeavour. Let's assume a company with 1,000 people, each with a 2GB PST file. We now have 100 million emails, and the cost according to our calculations above is a whopping ((.25*2,000*25) + (200,000,000 * 0.000075 * 90)) = $1,362,500, or about $0.013625 / email, or $1,362.50 / person.
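The migration cost scales linearly with the number of PST files, as a short Python sketch shows. The function and variable names are illustrative only. The figures are copied from the text as written: the expression uses 200,000 emails per PST (200,000,000 company-wide), while the per-email figure divides by 100 million emails.

```python
# Prices from the list of assumptions (USD per GB).
TIER1_PER_GB = 25.0
RAM_PER_GB = 90.0
INDEX_RATIO = 0.25
# Per-email metadata factor, exactly as written in the expressions.
PER_EMAIL_METADATA_FACTOR = 0.000075


def migration_cost(total_gb, emails_in_expression):
    """Indexing plus metadata cost in USD; no storage savings are credited."""
    index_cost = INDEX_RATIO * total_gb * TIER1_PER_GB
    metadata_cost = emails_in_expression * PER_EMAIL_METADATA_FACTOR * RAM_PER_GB
    return index_cost + metadata_cost


people = 1_000
per_pst = migration_cost(2, 200_000)
company = migration_cost(2 * people, 200_000 * people)
per_email = company / 100_000_000  # the text divides by 100 million emails
per_person = company / people

print(f"one 2GB PST:  ${per_pst:,.2f}")
print(f"company-wide: ${company:,.2f}")
print(f"per email:    ${per_email:.6f}")
print(f"per person:   ${per_person:,.2f}")
```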

But let's assume that it takes a human 30 seconds to decide that an email is not interesting in a discovery request. Then in one hour 120 emails can be rejected; at (say) $300/hr that is $2.50 / email, more than 180 times as much as the proactive choice. Now imagine having to service a legal discovery request quarterly and you can see how the savings can really add up. If a legal discovery request is ever made, there is a clear savings to be had when the work is done proactively, but it is a significant expense otherwise.
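The review-cost arithmetic is simple enough to check in Python. This is a sketch with hypothetical names; the 30-second and $300/hour inputs come from the paragraph above, and the proactive figure of $0.013625 per email comes from the migration estimate. At 30 seconds per email a reviewer gets through 3600 / 30 = 120 emails in an hour.

```python
SECONDS_PER_HOUR = 3600


def review_cost_per_email(seconds_per_email, hourly_rate):
    """Cost in USD for a human to review (and reject) one email."""
    emails_per_hour = SECONDS_PER_HOUR / seconds_per_email
    return hourly_rate / emails_per_hour


# Proactive per-email cost, taken from the migration estimate in the text.
PROACTIVE_COST_PER_EMAIL = 0.013625

reactive = review_cost_per_email(seconds_per_email=30, hourly_rate=300)
ratio = reactive / PROACTIVE_COST_PER_EMAIL
print(f"reactive review: ${reactive:.2f} per email")
print(f"ratio to proactive cost: {ratio:.0f}x")
```

Trying other values for `seconds_per_email` and `hourly_rate` (for example the $600/hour upper bound mentioned earlier) shows how quickly the reactive cost moves.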

Archiving is about saving 30-75% on basic IT costs over the lifetime of the content, and that is obviously a good thing. But in some circumstances "Management" is about saving 300% or more yearly (and in others it is about an expense that is just as high!).

The clear differentiator of the Open Text Content Server is that the management solutions and archiving solutions are fully integrated, so that management can also enjoy the cost savings that archiving provides. This is not the usual course of affairs in either the archiving (IT) or management (legal) space.

 
