"There was 5 exabytes of information created between the dawn of civilization through 2003," Schmidt said, "but that much information is now created every 2 days, and the pace is increasing."
I've seen the above tossed around a bit lately, generally shared with a note of surprise. I am in fact a huge fan of open data, and of our increasing ability to do fascinating things with it. But I don't think the above warrants any surprise - it helps to remember that "the amount of data created" hasn't increased, only our ability to usefully capture it.
One way to look at this is with the following assumption: if I snap my fingers, I've created data. We don't typically look at it that way, and this wouldn't count as 'data' as defined in the quote above. But I could very well snap my fingers before 2003 - why doesn't this count as data?
The answer is that counting something as data requires only two things: 1) we can quantify it, and 2) we can do something useful with that quantification.
What happened in 2003? Before, if I snapped my fingers, nothing quantifiable happened. The difference is simple: now, if I snap my fingers, I can potentially have some type of acoustically-sensitive piece of sensor-based technology easily attached somewhere on my body, a device whose sole function is to translate physical acoustic wave properties into numbers.
That solves requirement #1. Requirement #2 is more interesting: what would I do with that information?
The answer is: I don't know yet.
Perhaps I collect it over time to see how many times I snap my fingers in a year. Maybe that's useful - I don't know yet. Perhaps I connect my sensor to Pachube, and use it to unlock a connected door whenever I snap my fingers. Maybe that's useful - I don't know yet.
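To make the first of those ideas a bit more concrete, here is a minimal sketch of turning sensor readings into a snap count. Everything in it is an assumption for illustration: the `count_snaps` name, the idea that the sensor hands me amplitude readings between 0 and 1, the 0.6 loudness threshold, and the short "refractory" pause that keeps one snap from being counted twice. It is not how any particular sensor or service actually works.

```python
from typing import Iterable


def count_snaps(samples: Iterable[float], threshold: float = 0.6, refractory: int = 3) -> int:
    """Count loud peaks in a stream of amplitude readings (hypothetical sensor output)."""
    count = 0
    quiet_needed = 0  # readings still to skip after a detected snap
    for amplitude in samples:
        if quiet_needed > 0:
            # Still inside the pause that follows the last snap; ignore this reading.
            quiet_needed -= 1
            continue
        if abs(amplitude) >= threshold:
            # Loud enough to call it a snap (threshold is an assumed value).
            count += 1
            quiet_needed = refractory
    return count


# Made-up readings: two loud bursts separated by quiet.
readings = [0.01, 0.02, 0.9, 0.7, 0.1, 0.02, 0.03, 0.85, 0.2, 0.01]
print(count_snaps(readings))
```

Run that over a year of readings and I'd have my yearly snap count. Whether that number is useful is still the open question.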
This is precisely what is so fascinating about the rise of open data - we have tons of it and we're just now learning how to extract value from it. What is the value of tracking and sharing every purchase you make via Blippy or every link you click via Voyurl or every place you visit via Foursquare or every meal you eat via Foodspotting?
I could easily conjecture on some potential values here, but to varying degrees there are a lot of people out there who say there is no value and that these things are a waste of time.
But the argument sounds a lot like the old musings that used to come up: "why on earth would it be that important to have a compass on an iPhone? What could you possibly do with that (besides find North)??"
What people are going to do with their open data (and others' data!) I don't know - but I do know there are a ton of people out there thinking about it. And it's the combination of data, sensor tech, and sharing that's ultimately valuable: if you take video recognition and combine it with that useless compass, you've got augmented reality. If you take the shared usage behavior of people's location data along with, say, their biometric health info, you've got a real-time sense of what health conditions are surfacing where.
So it helps to remember two more things as well: 1) our ability to capture data is only going to get better, and 2) the number of things data is useful for will only increase. We might not know what we're doing with it yet (and there'll be a lot of debate about what is 'valuable' along the way - should deep biometric data be shared?), but as we figure it out it'll be increasingly fantastic.
edit: I just coincidentally ran into this appropriate passage, during a reading of Bruce Sterling's Shaping Things: