Fail Fast, Learn Faster - Randy Bean
The Big Data Value Proposition
On a picture-perfect May morning in 2012 on the campus of Stanford University in Palo Alto, California, I attended the inaugural Big Data conference at the invitation of Accel Partners. This event represented a milestone in the history of Big Data. Accel Partners had emerged as the hottest venture firm in Silicon Valley on the heels of their successful investment in Facebook. The event was billed as a “who's who” of the Silicon Valley technology elite. A parade of notable speakers, including Andy Bechtolsheim, co-founder of Sun Microsystems, and Doug Cutting, originator of Hadoop and chief architect at Cloudera, the latest Accel investment, followed one another, each extolling the “revolutionary” potential of Big Data. The speakers set forth grand claims for its promise. “Big Data was the next wave in technology innovation,” remarked one speaker.5 “Big Data would change how the world made use of information,” stated another. “Big Data would enable insights that would change society,” commented a third.
Each speaker extolled the technological underpinning that made Big Data truly compelling: the notion that you could simply take your data and “load and go,” shortcutting the massive data-preparation and data-engineering effort that was the bane of corporate existence, consuming roughly 80% of all data activity. This pertained specifically to the tremendous time and effort required to transform “dirty data” into a usable asset with meaningful business value. Whole companies, indeed an entire industry, had been built to respond to this need. It was the ongoing lament of many a data analyst: “We spend 20% of our time on data analysis, but 80% of our time on accessing and preparing the data.”
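The “load and go” idea the speakers were describing is what practitioners later came to call schema-on-read: ingest raw records as-is and apply structure and cleanup at query time, rather than running a months-long upfront ETL project. A minimal sketch in Python (the data, field names, and `amount` helper are invented purely for illustration):

```python
import io
import json

# Hypothetical raw event feed ("dirty data"): schema-less JSON lines
# with missing fields and inconsistent value types.
raw = io.StringIO(
    '{"user": "a1", "amount": "19.99", "channel": "web"}\n'
    '{"user": "b2", "amount": 5.00}\n'
    '{"user": "c3", "amount": "bad", "channel": "store"}\n'
)

# "Load and go": ingest every record as-is, with no upfront schema.
records = [json.loads(line) for line in raw]

def amount(rec):
    """Apply the schema lazily, at read time; tolerate dirty values."""
    try:
        return float(rec.get("amount", 0))
    except (TypeError, ValueError):
        return 0.0  # unparseable value: treat as zero for this analysis

# Analysis proceeds immediately, despite the messy input.
total = sum(amount(r) for r in records)
print(round(total, 2))
```

The trade-off, of course, is that the 80% of cleanup work does not disappear; it is deferred to read time and repeated by each analysis that touches the data.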
The big attraction of Big Data for many corporations was the ability to bypass hundreds of hours of upfront data engineering, so that data could be accessed sooner and more easily for analysis and put to good use. Anyone who has worked in the corporate world knows the painful refrain about how long it takes to answer a new business question that requires adding a new data source: “Fifteen months and five million dollars.” Senior business executives were resigned to a state in which getting value out of data quickly was not something they could expect to see in their business lifetimes. Now a cadre of engineers, data experts, and venture investors were heralding a breakthrough. Data would be liberated from the tyranny of the data gatekeepers; it could be made available to everyone, faster.
The radical implication of the load-and-go notion was that data users would no longer have to wait through the long, arduous data-engineering processes that had so long hamstrung the ambitions of data analysts. Speed would come from shortening the cycle from data access to analytical results, the way the railroad shortened travel relative to the horse and buggy, or air transport relative to land and sea. Big Data would make it possible to get more done for less, lowering the cost of managing data by eliminating much of the costly upfront preparation and engineering, the dreaded 80% of time and effort.