I embarrassed myself the last time I asked a question about Nutanix on this board, and took Saul’s words about “doing the work” to heart. I remember seeing a link to the NTNX Bible (huge tip to CMFALieberman) a few weeks ago, which is the company blueprint. In an attempt to better understand what Nutanix actually does, I went ahead and read it. At over 300 pages it is incredibly dense, and I’ve only included notes regarding the company’s thesis, not the hundreds and hundreds of pages detailing how the software works button by button, and line of code by line of code.
I’ve included some backstory about mainframe computers and kernel space, as such particular information was necessary to absorb before truly grasping the necessary ins and outs of the company. My apologies for the length, and if this has already been covered on this board. I know Nutanix has been heavily discussed here, but I couldn’t find something similar to what I’ve done below.
A Quick History of “The Cloud”
- In the beginning of computers, all data lived on a large mainframe. The only way to access the data was to use that large machine, and only the person using that machine had access to the information. This created a “silo”, or information walled off from other users/machines.
- Over time, the devices holding the data got smaller (laptops) and more portable (external hard drives/phones). However, only the person using that device had access to the information. Silos still remained.
- Data is now moving to the cloud, which allows users to share and access that data from any location on Earth, provided they have an internet connection. This removes silos, and is being adopted by companies all across the globe.
Pillars
The core pillars of any cloud service are:
- Self Service (the ability for a person to easily access the data)
- Service Level Agreements (SLA) (this guarantees that the cloud remains active and operational, and at a certain speed)
- Fractional Consumption Models (while some services are free, such as say Gmail, other services can be billed out to the customer)
Cloud Classifications
The last pillar above (ie, the money making one) can subsequently be broken down into three classifications. These are:
- Software as a Service (Microsoft Word)
- Platform as a Service (Apple’s App Store)
- Infrastructure as a Service (this would be the brain and central nervous system, keeping all things in your company working properly).
Cloud Latency
Faster is better, especially on the Cloud. The more users accessing the network, the more memory (RAM) the server must possess to avoid latency. Where a server does its work also affects latency, and that work happens in one of two places:
Kernel Space
- The most privileged part of the Operating System (OS)
- Handles memory management
- Contains the physical device drivers
User Space
- Basically “everything else”
- Handles things such as opening Microsoft Word
When the user requests something done (ie, save the latest presentation in PowerPoint), the system routes that request to either the Kernel Space or the User Space, and the task is processed. This of course takes memory (or let’s also call it computational power) away from other programs also running, thus latency is introduced. The more complicated the request, the more latency. The more requests happening at the same time, the more latency. This is why the first solution for a computer running slow is to “close all other programs”, as those programs might all be sending requests to the kernel/user space, thus depleting the overall computational power of the device.
A simple flow path of this is as follows:
I WANT TO DO SOMETHING > COMPUTER SENDS THAT MESSAGE TO THE KERNEL > THE KERNEL STOPS WHAT IT IS DOING > THE KERNEL EXECUTES YOUR DESIRE > YOU GET THE RESULT
These requests can be handled in two separate ways:
Polling
- Constantly asking the kernel/user space if it needs anything
- Requires constant CPU, but much lower latency
Interrupt
- Only acts when given a direction
- Tells Kernel to stop what it is doing and execute your tasks
- Much higher latency
As a result of this, Operating Systems are being written more towards Polling than Interrupting, which results in this flow path:
I WANT TO DO SOMETHING > COMPUTER WAS ALREADY CONSTANTLY CHECKING IF YOU WANTED TO DO SOMETHING, AND HOLDING THAT MEMORY TO EXECUTE IT > YOU GET THE RESULT
In even simpler terms, kernel space is the CEO and doesn’t need to be bothered with high volumes of mundane tasks that lower level employees (User Space) can handle. This frees up the CEO to put all their efforts towards solving the biggest problems of the day.
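To make the polling versus interrupt trade-off concrete, here is a toy Python sketch. It is not from the Bible: the function name `simulate`, the "ticks" unit, and the cost numbers (2 ticks of work per request, 5 ticks to wake up a sleeping handler) are all made-up assumptions. It only models the idea described above, that polling burns CPU constantly but answers faster, while interrupts save CPU but add a wake-up delay to each request.

```python
def simulate(arrivals, total_ticks, service_ticks=2, wakeup_ticks=5, polling=True):
    """Toy model: return (average latency in ticks, CPU ticks consumed)."""
    free_at = 0      # tick at which the handler is next available
    busy_ticks = 0   # ticks spent waking up or doing real work
    latencies = []
    for arrived in sorted(arrivals):
        start = max(arrived, free_at)
        # Assumption: an interrupt-driven handler pays a fixed wake-up cost
        # per request, while a polling handler is already awake and waiting.
        overhead = 0 if polling else wakeup_ticks
        finish = start + overhead + service_ticks
        latencies.append(finish - arrived)
        busy_ticks += overhead + service_ticks
        free_at = finish
    # Assumption: a polling handler keeps the CPU busy for the whole window.
    cpu_ticks = total_ticks if polling else busy_ticks
    return sum(latencies) / len(latencies), cpu_ticks


arrivals = [10, 50, 90]  # hypothetical request arrival times
print("polling:  ", simulate(arrivals, total_ticks=100, polling=True))
print("interrupt:", simulate(arrivals, total_ticks=100, polling=False))
```

With these made-up numbers, polling answers each request in 2 ticks but uses all 100 CPU ticks, while the interrupt version takes 7 ticks per request and uses only 21 CPU ticks.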
All that said, we are now getting into WHAT Nutanix does.
Hyper-Convergence
Boiled down, hyper-convergence is natively combining two or more separate components (say the graphics card and the cooling fan motor) into a single unit, thus reducing the computational power required. For example, when the graphics card on your computer goes into overdrive to display a large visual element, the onboard cooling fan motor natively speeds up and better cools the system core. Taken directly from the bible, “In the case of Nutanix, we natively converge computational tasks to form a single node used in our application.”
Software-Defined Intelligence
Taken directly from the bible, “Software-defined intelligence is taking the core logic (ie, its central nervous system) away from specialized hardware (ie, an application specific chip such as one that manages only the microphone in your computer) and doing it in software in commodity hardware (widely available, not specialized). At Nutanix, we take traditional logic and put that into software that runs on standard hardware.”
In layman’s terms, Nutanix software does what complicated hardware used to do on widely accessible and affordable hardware.
The benefits of this to a business are as follows:
- Rapid release cycles (software can be updated constantly and instantly, instead of waiting for new hardware product cycles)
- Eliminates the need for fancy, specialized (ie, expensive) hardware
- Lifespan investment protection (ie, you can run new software on old hardware. Think about how you are able to update your computer operating system without changing out hardware components).
Distributed Autonomous Systems
Taken directly from the bible, “distributed autonomous systems involve moving away from the traditional concept of having a single unit (ie, motherboard or graphics card) responsible for doing something and distributing that role among all nodes within the cluster. Traditionally, vendors have assumed that hardware will be reliable, which, in most cases is true. However, hardware will eventually fail and handling that fault in an elegant and non-disruptive way is key. These distributed systems are designed to accommodate and remediate failure, to form something that is self-healing and autonomous. In the event of a component (ie, hardware) failure, the system will transparently handle and remediate the failure, continuing to operate as expected.”
In layman’s terms, hardware failure is a huge pain in the ass so why not use software that is cheaper, more scalable, redundant and consistently checks up on, and fixes, itself automatically?
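The "self-healing" idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how Nutanix actually implements it: the `Cluster` class, its method names, and the replication factor of 2 are hypothetical. The point is that every piece of data lives on more than one node, so when a node dies the cluster copies the surviving replica to another node and keeps serving reads.

```python
class Cluster:
    """Hypothetical toy cluster that keeps several copies of each item."""

    def __init__(self, node_names, replication_factor=2):
        self.nodes = {name: {} for name in node_names}  # node -> {key: value}
        self.rf = replication_factor

    def _least_loaded(self, exclude=()):
        # Prefer nodes holding the fewest items; break ties by name.
        names = [n for n in self.nodes if n not in exclude]
        return sorted(names, key=lambda n: (len(self.nodes[n]), n))

    def write(self, key, value):
        for name in self._least_loaded()[: self.rf]:
            self.nodes[name][key] = value

    def read(self, key):
        for data in self.nodes.values():
            if key in data:
                return data[key]
        raise KeyError(key)

    def replica_count(self, key):
        return sum(key in data for data in self.nodes.values())

    def fail_node(self, name):
        lost = self.nodes.pop(name)
        # Self-heal: bring every affected item back up to rf copies.
        for key in lost:
            holders = [n for n, data in self.nodes.items() if key in data]
            if not holders:
                continue  # every copy is gone; nothing to heal from
            value = self.nodes[holders[0]][key]
            missing = max(0, self.rf - len(holders))
            for target in self._least_loaded(exclude=holders)[:missing]:
                self.nodes[target][key] = value


cluster = Cluster(["node-a", "node-b", "node-c"])
cluster.write("vm1", "disk image")
cluster.fail_node("node-a")  # simulate a hardware failure
print(cluster.read("vm1"), cluster.replica_count("vm1"))
```

After `node-a` fails, the read still succeeds and the item is back to two copies, without anyone stepping in to fix it by hand.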
Incremental and Linear Scale Out
Taken directly from the bible, “Incremental and linear scale out relates to the ability to start with a certain set of resources and, as needed, scale them out while linearly increasing the performance of the system. Traditionally [with hardware], you’d have 3-layers of components: servers, storage, and network – all of which are scaled independently. As an example, when you scale out the number of servers you’re not necessarily scaling out your storage performance. With a hyper-converged platform like Nutanix, however, you have the ability to improve all three at the same time.”
In layman’s terms, think about this like a traditional gas engine versus the software running a Tesla. Swapping a part on a gas engine does not mean that all other pre-existing parts of that engine are now running at peak capacity. On a Tesla, however, when you update the software, you optimize the entire operational system of the vehicle.
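The scaling point can also be shown with some back-of-the-envelope Python. The throughput numbers and function names below are invented for illustration only, not taken from Nutanix. The assumption is that in a traditional 3-layer setup the slowest layer caps the whole system, while each hyper-converged node brings its own compute and storage together.

```python
def three_tier_throughput(servers, storage_controllers,
                          per_server=100, per_controller=250):
    """Servers and storage scale separately, so the slower layer is the cap."""
    return min(servers * per_server, storage_controllers * per_controller)


def hyperconverged_throughput(nodes, per_node=100):
    """Assumption: each node adds compute and storage at the same time."""
    return nodes * per_node


for n in (2, 4, 8):
    # Add servers only; the single storage controller stays as it was.
    print(n, three_tier_throughput(n, storage_controllers=1),
          hyperconverged_throughput(n))
```

With these made-up figures, going from 4 to 8 servers in the 3-layer setup adds nothing because the single storage controller is already maxed out at 250, while the hyper-converged number keeps doubling.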
Hopefully all of the above better defines what Nutanix does. It was helpful for me.
Brandon