If you’re taking on big data, you’ll quickly realize that it means taking on a lot of unstructured data. Unstructured data doesn’t fit into a typical relational (SQL) database, so you’ll be looking into a type of database that can accept and store it. In most cases, that means selecting a NoSQL database.
Don’t you just love a heated debate among tech geeks? Along with arguments over the rightful place of cloud computing, the merits and risks of BYOD, and how to leverage the IoT, you can watch database administrators duke it out over NoSQL versus relational (usually SQL-based) databases. Which one is better? Which one best suits your needs? Grab a ringside seat and let’s find out.
Calin-Andrei Burloiu, Big Data Engineer at antivirus company Avira, and Radu Pastia, Senior Software Developer in the Big Data Team at Orange, are the team behind Couchdoop, a high-performance connector for bridging Hadoop and Couchbase.
Calin and Radu ran their CDH + Couchbase setup on the Full Metal Cloud and documented how Couchdoop performs as environment parameters vary. These are their findings.
Avira’s large scale applications have a traditional 2-tier architecture:
• An analytical tier built around the Hadoop ecosystem, which crunches large amounts of user event logs. We use Cloudera’s Distribution of Hadoop (CDH).
• A real-time tier that exposes web services to almost 100 million users. This tier requires a high-performance database, and we decided to use Couchbase, which is known for its sub-millisecond response time.
However, when we tried to integrate the two technologies, Hadoop (CDH) and Couchbase, we soon concluded that the existing solutions created a bottleneck. So we decided to write our own, and Couchdoop, a high-performance Hadoop connector for Couchbase, was born.
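The hand-off between the two tiers described above can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the log format, the function names, and the `user::` key prefix are hypothetical, a plain function stands in for the Hadoop job, and an in-memory dict stands in for Couchbase. It does not use or represent the Couchdoop, Hadoop, or Couchbase APIs.

```python
from collections import Counter

# Hypothetical user event log lines in the form "<user_id>,<event_type>".
EVENT_LOG = [
    "u1,scan",
    "u2,update",
    "u1,scan",
    "u1,update",
    "u2,scan",
]


def analytical_tier(log_lines):
    """Batch step: count events per user (a stand-in for a Hadoop job)."""
    counts = {}
    for line in log_lines:
        user_id, event_type = line.split(",")
        counts.setdefault(user_id, Counter())[event_type] += 1
    return counts


def export_to_store(counts, store):
    """Connector step: write one document per user key into the store."""
    for user_id, counter in counts.items():
        store[f"user::{user_id}"] = dict(counter)


def realtime_lookup(store, user_id):
    """Real-time step: a single key lookup, as a web service would do."""
    return store.get(f"user::{user_id}", {})


store = {}  # in-memory dict standing in for the real-time database
export_to_store(analytical_tier(EVENT_LOG), store)
print(realtime_lookup(store, "u1"))  # {'scan': 2, 'update': 1}
```

The export step is where a connector sits: the batch output has to be written into the key-value store quickly enough that the real-time tier always serves reasonably fresh data.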
This is the first in a series of performance benchmarks on NoSQL databases that we plan to share with you. Our goal is to understand the various scaling profiles of distributed database technologies and to identify environments that provide the best performance for the price. Many of our findings can also be applied to on-premises infrastructure and even to some cloud scenarios.
This performance benchmark on Couchbase shows sub-millisecond response times, but also a difference between GET/PUT operations and QUERY operations when multiple instances are added to the cluster. We have also tested how sensitive Couchbase is to memory access time.