Oracle OpenWorld started yesterday, and Larry Ellison announced new in-memory options in Oracle 12c. According to PCWeek, “The Big Memory Machine costs $3 million, a sum Ellison termed ‘a fraction’ of what competitors charge.”
The core of the newly announced Oracle technology is a columnar store. It is a genuinely fascinating technology that leverages compression, vectorized execution using SIMD, and parallel query processing. However, columnar storage has been on the market for a long time and has begun to commoditize; Oracle is probably the last big vendor to bring it to market.
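As a rough illustration of why the columnar layout matters (a toy sketch in Python/NumPy, not a description of Oracle’s implementation): storing each column contiguously means a scan touches only the bytes it needs, and values can be processed in tight vectorized loops, which is exactly where SIMD and lightweight compression such as dictionary encoding pay off.

```python
import numpy as np

# Toy columnar table: each column is a contiguous array, so a scan over
# one column never touches the bytes of the others.
n = 1_000_000
orders = {
    "region_id": np.random.randint(0, 50, size=n).astype(np.int8),  # small dictionary-style codes
    "amount":    np.random.rand(n).astype(np.float32),
    "customer":  np.random.randint(0, 100_000, size=n),
}

# "SELECT SUM(amount) WHERE region_id = 7" becomes two vectorized passes.
# NumPy runs these loops in compiled code, where SIMD instructions apply.
mask = orders["region_id"] == 7          # predicate over the narrow code column
total = orders["amount"][mask].sum()     # aggregate over only the matching values

print(f"matching rows: {mask.sum():,}, total amount: {total:,.2f}")
```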
But even if you pay big dollars, you don’t solve your analytics problem for good. There is one common property among all analytics workloads: they grow. Suddenly 32 TB is not all that big. And the price! Nearly $100,000 per TB ($3 million for 32 TB works out to roughly $94,000 per TB). In stark contrast, commodity DRAM now costs about $5,000 per TB, and FusionIO, which is almost as fast as DRAM, is about $2,000 per TB. With a vertically-scaled architecture, going beyond 32 TB is shockingly expensive: you either need to switch to a bigger appliance or move to Exadata or Oracle RAC.
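A quick back-of-the-envelope comparison makes the gap concrete; the figures below are simply the approximate prices quoted above, rounded, not vendor list prices.

```python
# Back-of-the-envelope cost-per-TB comparison using the approximate
# figures quoted above (rounded, circa the announcement).
appliance_price = 3_000_000       # "Big Memory Machine" price in USD
appliance_capacity_tb = 32        # advertised capacity in TB

cost_per_tb = {
    "Oracle in-memory appliance": appliance_price / appliance_capacity_tb,
    "Commodity DRAM":             5_000,   # approx. USD per TB
    "FusionIO flash":             2_000,   # approx. USD per TB
}

for name, cost in cost_per_tb.items():
    print(f"{name:28s} ~${cost:>9,.0f} per TB")
```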
Oracle is pushing expensive hardware for a reason: it allows Oracle to justify charging much higher licensing fees. It’s a win-win for Oracle on hardware and software, and a lose-lose for customers who need to balance their IT budgets.
Columnar technology is a very important checkbox for Oracle, but by focusing on it, they divert attention from the more important trend: the ability to run on commodity hardware. If you look at Google and Facebook data centers, you’ll find that everything is engineered around commodity hardware. This applies even more to in-memory technology: in-memory stores eliminate I/O costs, which enables incredible performance on commodity servers, whose DRAM is just as fast as the DRAM in a proprietary appliance.
As we look at current real-world use cases for in-memory databases, we see an absolute explosion of requirements across many sectors. In ad-tech, for example, companies are not just dealing with Big Data; in reality they are dealing with “ungodly” data volumes and velocities. Customers require the ability to ingest 500,000,000 events per hour while concurrently processing real-time analytical workloads and storing petabytes of historical data. This kind of data very quickly exceeds 32 TB, yet analyzing it in real time is still incredibly important. SingleStore routinely runs on hundreds of nodes, providing incredible storage and computing power on commodity virtualized cloud infrastructure with high availability.
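A rough calculation shows why this kind of workload maps naturally onto a scale-out cluster rather than a single appliance; the 100-node figure below is an illustrative assumption, not a sizing recommendation.

```python
# Rough ingest arithmetic for the figure quoted above.
events_per_hour = 500_000_000
events_per_second = events_per_hour / 3600            # ~139,000 events/sec cluster-wide

cluster_nodes = 100                                    # hypothetical commodity cluster size
per_node_rate = events_per_second / cluster_nodes      # ~1,400 events/sec per node

print(f"Cluster-wide ingest: {events_per_second:,.0f} events/sec")
print(f"Per-node ingest:     {per_node_rate:,.0f} events/sec across {cluster_nodes} nodes")
```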
With technologies like Amazon Web Services (AWS), as well as private cloud offerings from VMware and OpenStack, the world is moving towards elastic computing. Appliances, like mainframes before them, don’t fit in a new world where customers think in terms of nodes in data centers, availability zones, and geographically distributed storage and compute. The new world is being built on distributed systems.