Richard Wray, writing some time ago in The Guardian, pointed out that the volume of data held was then estimated at 487 billion GB. To put this in perspective, he explained that in printed form it would make a pile stretching to Pluto 10 times over. The really staggering statistic, however, was that the printed stack would be growing faster than NASA's fastest rocket can travel. I haven't checked the stats, but a quick back-of-the-envelope calculation suggests he's in the right order of magnitude.
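Out of curiosity, here is roughly what that back-of-the-envelope check might look like in Python. The bytes-per-page, page-thickness, and Earth-to-Pluto distance figures below are my own rough assumptions rather than anything taken from Wray's article.

```python
# Rough order-of-magnitude check of the "pile of paper to Pluto" claim.
# All constants below are illustrative assumptions, not figures from the article.

BYTES_TOTAL = 487e9 * 1e9       # 487 billion GB, treated as ~4.87e20 bytes
BYTES_PER_PAGE = 2_000          # assume ~2 KB of plain text per printed page
PAGE_THICKNESS_M = 0.0001       # assume ~0.1 mm per sheet of paper
PLUTO_DISTANCE_M = 5.9e12       # roughly 5.9 billion km, an average Earth-Pluto distance

pages = BYTES_TOTAL / BYTES_PER_PAGE
stack_height_m = pages * PAGE_THICKNESS_M
times_to_pluto = stack_height_m / PLUTO_DISTANCE_M

print(f"Pages printed:     {pages:.2e}")
print(f"Stack height:      {stack_height_m:.2e} m")
print(f"Reaches to Pluto:  {times_to_pluto:.1f} times")
```

Under those assumptions the stack reaches Pluto around four times over – the same order of magnitude as the figure quoted, which is all a sanity check like this can hope to show.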
What does this mean? Apart from the staggering numbers, it tells us that the problem for organisations isn't holding large amounts of information – they already do that. Nor is the problem necessarily how to index that information – increasingly they have defined information standards for that. The real problem is continual growth – very few taxonomies or models properly account for how quickly the data keeps expanding.
A new generation of Information Management techniques is starting to appear, designed to deal less with the data you have now and more with the data you are likely to gain in the future. My next few blog entries will introduce a couple of these techniques for both structured and unstructured data.