Looker, Hadoop and ice hockey

  • 28 March 2022

This content, written by Frank Bien, was initially posted in Looker Blog on Feb 8, 2016. The content is subject to limited support.

One of my favorite quotes is from the Great One, Wayne Gretzky, who said,

“I skate to where the puck is going to be. Not where it’s been.”

Today, we announced something that is our version of skating to where the puck is going to be. For years, everyone has been talking about Hadoop, Big Data, and the innovations in data storage and infrastructure. And almost all analytics vendors jumped in to try to build tools that could analyze massive datasets in these cutting-edge new storage systems. But, as innovative as it was for storing data, the Hadoop ecosystem was not ready to do business analytics, and by that I mean fast, interactive business analysis and exploration of enormous quantities of data. The vendors that jumped in early had to build complicated systems that transformed, moved, cubed, and generally messed with the data in Hadoop. And, at the end of it all, none of the solutions allowed for the speed and repeatability that are necessary for business analytics to be truly valuable.

And what did we do at Looker? We waited.

We knew that the Hadoop ecosystem would eventually get to the point where it was no longer necessary to move data out of Hadoop, and SQL-based analysis could instead be performed directly on the data where it sits. A point at which the schema-on-read technology that we have perfected at Looker could bring actual business analytics to Big Data. That basic decision-making analysis that all businesses are desperate for, but that can easily get forgotten among the flash and sizzle of the machine-learning and predictive-analysis chatter that peppers the blogs and news.

Where are we today? Today we announced that, with the improvements in the speed and performance of connectors like Spark SQL and Presto and updates to Impala and Hive, we have customers like Acorns and Yahoo doing their business analytics directly on Hadoop. Actionable, reliable analytics over all the data in their massive data stores - not just a subset.

Want to learn more?

Check out the full read in our
