How Intel’s Newest Product Enhancements Could Redefine the Future of Infrastructure Design

Friday, April 12, 2019, 03:55 PM, from The Apple Blog

On April 2, at its Data-Centric Innovation Day, Intel announced a slew of new products, some brand new and some updates to existing product lines. There was news for everybody, whether your interests lie in the datacenter, at the edge, or in infrastructure in general. What impressed me most was the new Optane DC Persistent Memory DIMM, and even more so its implications when coupled with the new 56-core Intel Xeon Platinum 9200 CPU.
Datacenter Optanization!
Until now, Optane was considered a great technology but, to some extent, a niche product. More and more vendors are adopting it as a substitute for NVDIMMs, as a tier 0 or a cache for their storage systems, yet it was still hard to foresee broader adoption of the technology. Yes, it is cheaper than RAM and faster than a standard NAND-based device, but that’s about it.
Maybe it was part of Intel’s strategy. The first generation of Intel Optane products was developed with Micron, and perhaps Intel didn’t want to be too aggressive at that stage, something which has most likely changed radically after their divorce. The introduction of the Optane DC Persistent Memory DIMM gives a good idea of the real potential and benefits of this technology, and shows how it could change the way infrastructures are designed in the future, especially for data-intensive workloads.
In practice, and put very simply, Optane DC Persistent Memory DIMMs work in conjunction with standard DDR4 RAM DIMMs. They are bigger (up to 512GB each) and slightly slower than DDR4 DIMMs, but they allow servers to be configured with several TBs of memory at a comparatively low cost. Optane is slower than RAM, but caching techniques and the fact that these DIMMs sit next to the CPU make the solution a good compromise. At the end of the day, you are trading some performance for a huge amount of capacity, and avoiding data starvation for a CPU that would otherwise have to fetch data from SSDs or, even worse, HDDs or the network.
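To put a rough shape on that compromise, here is a minimal back-of-the-envelope sketch in Python. It assumes a simple two-tier model in which a fraction of accesses is served from DDR4 and the rest from the slower Optane DIMMs. The function name and the latency figures are illustrative placeholders of mine, not numbers from Intel’s presentation.

```python
def effective_latency_ns(hit_rate, fast_ns, slow_ns):
    """Average access latency for a fast tier sitting in front of a slower one.

    hit_rate is the fraction of accesses served by the fast tier (0.0 to 1.0).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * fast_ns + (1.0 - hit_rate) * slow_ns


# Placeholder latencies chosen only to illustrate the model (assumptions,
# not measurements of DDR4 or Optane DIMMs).
FAST_TIER_NS = 100.0
SLOW_TIER_NS = 300.0

if __name__ == "__main__":
    for hit_rate in (0.50, 0.90, 0.99):
        avg = effective_latency_ns(hit_rate, FAST_TIER_NS, SLOW_TIER_NS)
        print(f"hit rate {hit_rate:.0%}: about {avg:.0f} ns on average")
```

The point of the model is only that the better the hot data fits in the fast tier, the closer the average gets to RAM latency, while the capacity comes from the cheaper tier.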
How Does It Work?
There are two operation modes for Optane DIMMs, persistent and non-persistent. I know it could sound confusing, but it’s actually very straightforward.
The way the DIMM operates is selected at the beginning of the bootstrap process. When non-persistent mode is selected, the DIMMs look like RAM, and the real RAM is used as a cache. This means that you don’t have to rewrite your app, and practically any application can benefit from the increased memory capacity. On the other hand, when Optane DIMMs operate in persistent mode, it is the application that stores data directly in the Optane and manages RAM and Optane as it sees fit and, as the name suggests, your data is still there after a reboot. SAP HANA, for example, has already demonstrated this mode, and several other applications will follow suit. Take a look at the video recorded at the Tech Field Day event that followed the main presentation; it’s a good deep dive on the product.
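To make the persistent mode a little more concrete, here is a minimal Python sketch of the general idea: the application maps a region into its address space and updates data in place, and the data is still there the next time the region is opened. It uses an ordinary memory-mapped file as a stand-in. On a real system the file would live on a filesystem backed by the Optane DIMMs, and applications would typically go through a dedicated library such as Intel’s PMDK. The helper names and the region layout are hypothetical, chosen for illustration only.

```python
import mmap
import os
import struct
import tempfile

REGION_SIZE = 4096  # size of the mapped region in bytes (arbitrary for this sketch)


def open_region(path, size=REGION_SIZE):
    """Map `size` bytes of the file at `path`, creating and zero-filling it if needed."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)
        # mmap keeps its own handle on the file, so fd can be closed afterwards.
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)


def bump_counter(path):
    """Increment a 64-bit counter stored at offset 0 of the region and return it."""
    region = open_region(path)
    try:
        (count,) = struct.unpack_from("<Q", region, 0)
        struct.pack_into("<Q", region, 0, count + 1)
        region.flush()  # push the update to the backing store
        return count + 1
    finally:
        region.close()


if __name__ == "__main__":
    # A temporary directory stands in for a persistent-memory mount point.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "counter.bin")
        print(bump_counter(path))  # 1: the region starts zero-filled
        print(bump_counter(path))  # 2: the earlier update survived the re-open
```

Each call opens the region from scratch, so the second result shows the state outliving the mapping that wrote it, which is the property the persistent mode extends across reboots.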

There Are Several Benefits (And a Few Trade-Offs)
All the performance tests shown during the event demonstrated the benefits of this solution. Long story short, every application that relies on large amounts of RAM will see a huge benefit. This is mostly because the latency to reach active data is lower than with any other solution on the market, but also because it comes at a fraction of the cost of a 100% RAM configuration, which is not possible today anyway because of the size of the DIMMs and their cost. All of this translates into faster results, fewer servers to obtain them, better efficiency and, of course, lower infrastructure costs.
Furthermore, with more and more applications taking advantage of the persistent memory mode, we will see interesting uses of Optane DIMMs, which could replace traditional, and expensive, NVDIMMs in many scenarios such as storage controllers.
What are the trade-offs then? Actually, these are not real trade-offs, but consequences of introducing an innovative technology. Optane DIMMs work only in new servers based on the latest Intel CPUs, those announced alongside the DIMMs. The reason is that the memory controller is in the CPU, and older CPUs wouldn’t be able to understand the Optane DIMM or manage the interaction between RAM and Optane.

Maintaining the Balance
As mentioned earlier, Intel announced many other products alongside Optane Persistent Memory DIMMs. All of them are quite impressive and, at the same time, each is needed to justify the others: it would be useless to have multi-TB RAM systems without a CPU strong enough to get the work done. The same goes for the network, which can quickly become a bottleneck if you don’t provide the bandwidth needed to move data to and from the server.
From my point of view, it’s really important to understand that we are talking about data-crunching monsters here, with the focus on applications like HPC, Big Data, AI/ML and the like. These are not your average servers for VMs, not today at least. On the other hand, this technology also opens up many additional options for enterprise end users, including the possibility of creating larger VMs or consolidating more workloads on a single machine (with all its pros and cons).
Another feature I thought noteworthy is the new set of deep learning instructions added to the new Xeon Platinum 9200 CPU. We are far from having general-purpose CPUs competing against GPUs, but the benchmarks shown during the presentations point to an incredible improvement for inference workloads (the process behind day-to-day ML activity once the neural network is trained). Intel has done a great job, on both hardware and software. With more and more applications relying on AI/ML, an increasing number of users will be able to benefit from it.

Closing the Circle
In this article, I’ve covered only a few aspects of these announcements, those most closely related to my job. There is much more to it, including IoT, edge computing, and security. I was especially impressed because the announcements express a vision that is broad, clear and confident, providing answers for the most challenging aspects of modern IT.
Most of the products presented during this announcement are focused on ultimate performance and efficiency or, at least, on finding the best compromise to serve next-gen, data-hungry and highly demanding applications. This is beyond the reach of many enterprises today and more in the ballpark of web companies and hyperscalers. That said, even if on a smaller scale, all enterprises are beginning to face these kinds of challenges and, whether the solutions come from the cloud or from their on-prem infrastructure, the technology to address them is now more accessible than ever.