Big Data Technologies

Spark Structured Streaming in Practice

Bigstep Solution Architect Andrei Muraru @ the HUG UK & Big Data Analytics London Meetup

How does Spark Structured Streaming handle real-time big data workloads? Here’s a case study presented by Bigstep Solution Architect Andrei Muraru during the Big Data Week 2016 global festival.

Spark Structured Streaming lets you express streaming computations the same way you would express batch computations on static data. The built-in engine incrementally and continuously updates the final results as streaming data continues to arrive. Andrei’s presentation covers how a real-life implementation of Spark Structured Streaming on top of a Hadoop cluster is helping a big online retailer analyze clickstream data and combine it with customer history information.
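To make the incremental model concrete, here is a minimal sketch in plain Python. It does not use the Spark API and is not the implementation from the presentation; it only imitates the idea of folding each newly arrived micro-batch of click events into a running aggregate and joining that aggregate with static customer history. The event fields, the `CUSTOMER_HISTORY` table, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical static table: customer history keyed by customer id.
CUSTOMER_HISTORY = {
    "c1": {"segment": "returning"},
    "c2": {"segment": "new"},
}


def update_click_counts(running_counts, micro_batch):
    """Fold one micro-batch of click events into the running per-customer counts."""
    for event in micro_batch:
        running_counts[event["customer_id"]] += 1
    return running_counts


def enrich(running_counts, history):
    """Join the current aggregate with the static customer history."""
    return {
        cid: {
            "clicks": clicks,
            # Customers missing from the history table get a placeholder segment.
            "segment": history.get(cid, {}).get("segment", "unknown"),
        }
        for cid, clicks in running_counts.items()
    }


def process_stream(batches, history):
    """Emit an updated, enriched result after every micro-batch."""
    counts = Counter()
    results = []
    for batch in batches:
        update_click_counts(counts, batch)
        results.append(enrich(counts, history))
    return results
```

Each call to `update_click_counts` touches only the new events, so the result is updated rather than recomputed from scratch, which is the property the engine description above refers to. In Spark itself the same logic would be written as an ordinary grouped aggregation and join over a streaming DataFrame.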

Building Data Lakes in the Cloud

Understand how building a data lake in the cloud differs from building one on premises

Every industry has both proven and potential data lake use cases. As enterprise data warehouses (EDWs) become ever less efficient at meeting new business needs, cloud-based data lakes have been gaining popularity with enterprises looking to close the technology gap. Cloud data lakes are purpose-built to meet the data management requirements of the evolving enterprise landscape.

How to Choose the Right NoSQL Database for All the Right Reasons

If you’re taking on big data, you’ll quickly realize that you’re also taking on a lot of unstructured data. Unstructured data doesn’t fit into a typical relational (SQL) database, so you’ll need a different type of database that can accept and store it. In most cases, that means selecting a NoSQL database.

5 Things Driving the Phenomenal Success of Apache Spark

If Hadoop is the tool that ushered in the era of big data, Spark is the one that’s driving the next phase of its evolution. Spark is the brainchild of a group of Berkeley grad students, and it has brought an entirely new set of use cases to big data and data analytics. Spark is more than just the right thing at the right time: it’s a way to get speed out of analytics where before there was power, but nothing in the way of real-time. So, what is sparking the popularity of Spark?