But for all the talk of “big data” and how daunting it all is, I think data volumes are going to be far bigger than we estimate now. As far as I can tell, most models of data usage look at developed markets and extrapolate the phenomenal growth in data from smartphones, PCs, companies and so on.
But this underestimates the usage of data in the developing world. Many countries are going to run straight through the non-networked, 2G world and join the data-everywhere, cloud-based, streaming world instead. And this has big implications for data.
The EMC Digital Universe infographic (pdf) suggests total world data will grow from 1,227 exabytes in 2010 to 7,910 exabytes in 2015. Although this looks like a huge increase compared to 2005 to 2010, when world data was estimated to go from 130 exabytes to 1,227 exabytes, the rate of growth they predict is actually slowing, from a factor of 9.4 to a factor of 6.4.
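Those growth factors are just ratios of the EMC figures quoted above. A quick sketch of the arithmetic, with variable and function names that are mine rather than anything from EMC:

```python
# World data volumes in exabytes, as quoted from the EMC infographic.
world_data_eb = {2005: 130, 2010: 1227, 2015: 7910}


def growth_factor(start_year, end_year, data=world_data_eb):
    """Return how many times larger the end-year total is than the start-year total."""
    return data[end_year] / data[start_year]


factor_2005_2010 = growth_factor(2005, 2010)
factor_2010_2015 = growth_factor(2010, 2015)

# Rounded to one decimal place, these give the 9.4 and 6.4 mentioned above.
print(f"2005 to 2010: x{factor_2005_2010:.1f}")
print(f"2010 to 2015: x{factor_2010_2015:.1f}")
```

So the absolute jump is bigger in the second period, but the multiplier is smaller.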
Instead, take a look at the McKinsey report on big data (pdf). On page 103 we can see a rough breakdown of data storage by world region. If we take North America as the target level, that region uses 6.5 petabytes per million people. Run the rest of the world at that level of data usage, and the world total of 6,750 petabytes goes up more than fivefold to 37,296 petabytes. See table below.
Now the rest of the world isn’t going to catch up with North America in the next 5 years in terms of data usage, but you get the idea of the scale of this. China is currently on 0.2 petabytes per million people. India is even lower. Working from models of developed countries is fine for now, but the rest of the world will catch up faster than those models assume, and use far more data. I’d rip up a few of those models and predictions and start again.
|Region|Petabytes|Population (m) (Source: Wolfram Alpha)|Petabytes per million people|Petabytes assuming North American data usage|Percentage change|
|---|---|---|---|---|---|
|Rest of APAC *|300|725|0.4|4,717|1,472|
* Rest of APAC population taken from Wikipedia, with Japan, China (incl HK and Macau) and India removed.
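For anyone who wants to redo the sums, here is a rough sketch of how a row of the table is worked out, using the Rest of APAC figures. It assumes the rounded North American rate of 6.5 petabytes per million people, so it comes out slightly below the table's 4,717 and 1,472, presumably because the table used an unrounded rate. The function name is purely illustrative.

```python
# Rounded North American rate quoted in the text (petabytes per million people).
# Assumption: the table itself may have used a slightly more precise figure.
NA_PB_PER_MILLION = 6.5


def project_region(stored_pb, population_m, target_rate=NA_PB_PER_MILLION):
    """Project a region's storage if it used data at the target per-capita rate."""
    current_rate = stored_pb / population_m          # petabytes per million people today
    projected_pb = population_m * target_rate        # petabytes at the target rate
    pct_change = (projected_pb - stored_pb) / stored_pb * 100
    return current_rate, projected_pb, pct_change


# "Rest of APAC" row: 300 petabytes stored, population of 725 million.
rate, projected, change = project_region(300, 725)
print(f"{rate:.1f} PB per million, {projected:,.1f} PB projected, {change:,.0f}% change")
```

Swap in any other region's petabytes and population to get its row.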