Bigstep Solution Architect Andrei Muraru @ the HUG UK & Big Data Analytics London Meetup
How does Spark Structured Streaming work with real-time big data workloads? Here’s a case study presented by Bigstep Solution Architect Andrei Muraru during the Big Data Week 2016 global festival.
Spark Structured Streaming lets you express streaming computations in the same way as computations on static data. The built-in engine incrementally and continuously updates the final results as streaming data continues to arrive. Andrei’s presentation covers how a real-life implementation of Spark Structured Streaming on top of a Hadoop cluster is helping a big online retailer analyze clickstream data and combine it with customer history information.
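To make the "same computation, incrementally updated" idea concrete, here is a minimal pure-Python sketch. It does not use the Spark API and is not taken from Andrei’s presentation; the names `batch_counts` and `RunningCounts` and the sample clickstream records are hypothetical, chosen only for illustration. It shows a page-view count computed once over a finished dataset, and the same count maintained as a running result while the data arrives in small batches.

```python
from collections import Counter


def batch_counts(events):
    """Static-data version: count page views over the full, finished dataset."""
    return Counter(page for _user, page in events)


class RunningCounts:
    """Streaming-style version: hold a result and update it per micro-batch."""

    def __init__(self):
        self.result = Counter()

    def update(self, micro_batch):
        # Only the newly arrived records are processed; earlier ones are
        # already folded into self.result.
        self.result.update(page for _user, page in micro_batch)
        return dict(self.result)


# Hypothetical clickstream records: (user id, page visited).
clicks = [
    ("u1", "/home"),
    ("u2", "/cart"),
    ("u1", "/cart"),
    ("u3", "/home"),
    ("u2", "/checkout"),
]

# Pretend the same records arrive over time in three small batches.
micro_batches = [clicks[0:2], clicks[2:4], clicks[4:5]]

running = RunningCounts()
for batch in micro_batches:
    snapshot = running.update(batch)  # result table after this batch

# After all batches, the running result matches the one-shot batch result.
assert running.result == batch_counts(clicks)
```

The point of the sketch is only that the query logic (count views per page) is written once, while the result is refreshed as new data arrives. In Spark itself the engine handles this bookkeeping, along with concerns this sketch ignores, such as fault tolerance and late data.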
Every industry has both proven and potential data lake use cases. With enterprise data warehouses (EDWs) proving increasingly inefficient in the face of new business needs, cloud-based data lakes have been gaining popularity with enterprises looking to close the technology gap. Cloud data lakes are purpose-built to meet the data management requirements of the evolving enterprise landscape.
If you’re taking on big data, you’ll quickly realize that this means taking on a lot of unstructured data. Unstructured data doesn’t fit into a typical relational (SQL) database, so you’ll be looking into a different type of database that can accept and store it. In most cases, that means choosing a NoSQL database.
Remember a few years ago, when Hadoop took knocks left and right for lacking usability, security, and other key features and functionality? Well, no more. A couple of weeks ago, Hortonworks revealed the latest version of Hortonworks DataFlow (HDF), its integrated system for dataflow management and streaming analytics.
If Hadoop is the tool that ushered in the era of big data, Spark is the one that’s driving the next phase of its evolution. Spark is the brainchild of a group of Berkeley grad students, and it brought an entirely new set of use cases for big data and data analytics. Spark is more than just the right thing at the right time; it’s a way to get speed out of analytics where before there was power, but nothing in the way of real-time results. So, what is sparking the popularity of Spark?