Posts tagged ‘NoSQL performance benchmark’

Memory, Big Data, NoSQL and Virtualization

In-memory processing is becoming the norm in large-scale data handling. This is a close-to-the-metal analysis of an important but often neglected topic: memory access times and how they impact big data and NoSQL technologies.

We cover the TLB, Transparent Huge Pages, the QPI link, Hyper-Threading and the impact of virtualization on applications with a large memory footprint. We present benchmarks of technologies ranging from Cloudera’s Impala to Couchbase and show how they are affected by the underlying hardware.
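As a small practical aside, one of the settings mentioned above can be inspected directly on a Linux host. The sketch below is only an illustration and is not taken from the presentation: it reads the standard sysfs file that reports the Transparent Huge Pages mode and extracts the bracketed (active) value. The helper names `parse_thp_mode` and `current_thp_mode` are hypothetical, and the file does not exist on non-Linux systems, in which case the function returns `None`.

```python
from pathlib import Path

# Standard Linux sysfs location for the THP mode; absent on other systems.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the active mode from a line such as 'always [madvise] never'."""
    for token in text.split():
        # The kernel marks the active mode with square brackets.
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def current_thp_mode(path=THP_ENABLED):
    """Read the active THP mode, or return None if the file is unavailable."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("THP mode:", current_thp_mode())
```

Running the script on each candidate host is a quick way to confirm that machines being compared are configured the same way before any benchmark starts.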

The key takeaway from the presentation below is a better understanding of how to size a cluster, how to choose a cloud provider and an instance type for big data and NoSQL workloads, and why not every core or GB of RAM is created equal.

 

If you have any questions, let us know in the comments.

How To Benchmark NoSQL Databases

People make wrong design decisions all the time. Developers are typically under immense pressure to meet a deadline, so they often make gut decisions and choose technologies they haven’t worked with before because they sound good on paper. The reverse is also often true: senior developers stick with proven technologies and their known pitfalls because they simply don’t believe other technologies can be better.

There are two ways to get a feel for your environment’s performance profile: Continue Reading

NoSQL Performance: Measuring Couchbase Performance with Couchdoop

Calin-Andrei Burloiu, Big Data Engineer at antivirus company Avira, and Radu Pastia, Senior Software Developer in the Big Data Team at Orange, are the team behind Couchdoop, a high-performance connector for bridging Hadoop and Couchbase.

Calin and Radu ran their CDH + Couchbase setup on the Full Metal Cloud and documented the performance of Couchdoop while varying environment parameters. These are their findings.

Use case

Avira’s large-scale applications have a traditional 2-tier architecture:

• Analytical tier built around the Hadoop ecosystem, which crunches large amounts of user event logs. We use Cloudera’s Distribution of Hadoop (CDH).
• Real-time tier which exposes web services to almost 100 million users. This tier requires a high-performance database, and we decided to use Couchbase, which is known for its sub-millisecond response time.
However, when we tried to integrate the two technologies, Hadoop (CDH) and Couchbase, we soon concluded that the existing solutions created a bottleneck. So we decided to write our own, and Couchdoop, a high-performance Hadoop connector for Couchbase, was born. Continue Reading

NoSQL Performance Benchmarks Series: Couchbase

This is the first in a series of performance benchmarks on NoSQL databases that we plan to share with you. Our goal is to understand the scaling profiles of distributed database technologies and to identify environments that provide the best performance for the price. Many of our findings also apply to on-premises infrastructure and even to some cloud scenarios.

This performance benchmark on Couchbase shows sub-millisecond response times but also a difference between GET/PUT operations and QUERY operations when multiple instances are added to the cluster. We have also tested the Memory-Access-Time sensitivity of Couchbase. Continue Reading
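
To make "response time" concrete, here is a minimal sketch of how per-operation latency might be collected and summarized. It is not the harness used in the benchmark: the in-memory dict stands in for a real database client, and the helper names `measure_latencies` and `percentile` are hypothetical. Against a real cluster you would replace the two lambdas with your client's GET and PUT calls.

```python
import math
import time


def measure_latencies(operation, keys):
    """Time each call to operation(key); return latencies in milliseconds."""
    latencies = []
    for key in keys:
        start = time.perf_counter()
        operation(key)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Assumption: a plain dict stands in for the database under test.
    store = {}
    keys = [f"user::{i}" for i in range(10_000)]

    put_ms = measure_latencies(lambda k: store.__setitem__(k, "value"), keys)
    get_ms = measure_latencies(lambda k: store.get(k), keys)

    for name, samples in (("PUT", put_ms), ("GET", get_ms)):
        print(name, "p50:", percentile(samples, 50), "ms",
              "p99:", percentile(samples, 99), "ms")
```

Reporting percentiles rather than a single average keeps tail latency visible, which matters when comparing operation types whose distributions differ.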