Much to my great joy, The X-Files is currently available on Amazon Prime, so to unwind and relax, I’ll periodically watch an episode, sometimes two, after the young kids go to bed. The seventh episode of the first season is entitled “Ghost in the Machine.” In it, a revolutionary computer called the Central Operating System (COS) has been created by Brad Wilczek, founder of the software company Eurisko. COS is smart enough to manage the entire Eurisko building; however, it isn’t making Eurisko any money, so the CEO of Eurisko, Benjamin Drake, decides to terminate the program over Wilczek’s objections. As you might imagine, COS “hears” this and takes steps to ensure its continued existence by deleting the “file” that was Benjamin Drake.
After investigating, Mulder suspects COS is behind the suspicious behavior. He shares his theory with Scully, and after an exchange about artificial intelligence and adaptive networks, a doubtful Scully exclaims, “C’mon Mulder! That kind of technology won’t be available for decades!” In that moment, much to the disappointment of those in the room with me, I paused the show in order to share some of what I heard from Jeff Aaron, @jeffreysaaron, and Osman Sarood, @osarood, of Mist Systems (now a Juniper company) at Cloud Field Day 5.
Who is Mist Systems?
Mist Systems, headquartered in Cupertino, CA, was founded in 2014 by Sujai Hajela and Bob Friday, both former Cisco executives, with a vision to build a wireless network for wireless devices: one that moves beyond simply providing a connection and instead enhances and simplifies wireless networking through the use of Artificial Intelligence, Machine Learning, and Location Services to deliver an enhanced, personalized user experience. According to an article written by Julie Bort, @Julie188, for Business Insider, the inspiration for this new wireless network came from Hajela’s daughter. Bort writes:
In 2014, before they tried to drum up a Series A venture capital investment, Hajela was talking to his daughter about his newfangled Wi-Fi product idea.
She told him, “Dad, that’s too technical.”
She wanted a network that simply put information about wherever she was at her fingertips.
“When she went to a mall, an amusement park, a museum or any social setting, she wanted the place to literally ‘talk to her’ about all the services it has to offer, with a high level of personalization, and not spam,” he says.
And the idea of a simple, agile, scalable, intelligent, resilient, and programmable network designed to support business innovation, mobility, and an optimal user experience resonates with customers. In just two years of shipping, Mist is doing business with 3 of the Fortune 10 and over 20 of the Fortune 500, spanning retailers, e-tailers, pharmaceuticals, airports, airlines, healthcare, higher education, manufacturing, and technology firms. The acquisition by Juniper further underscores Mist’s success and enables Juniper/Mist to bring that same simplicity, agility, intelligence, resilience, and programmability to the wired network.
AI for IT Requires the Right Cloud Architecture
Mist is endeavoring to usher in a new era of IT, the AI-driven enterprise, in which the traditional manual, reactive model is replaced by an automated, proactive model focused on the user experience. From day 1, Mist sought to make one question easy to answer: how can I determine what’s happening on the network? But even beyond that, Mist set out to use the cloud, specifically AI and ML, not only to perform event correlation and anomaly detection, but to diagnose issues and resolve them without human interaction using their Virtual Network Assistant (VNA) Interface Engine, thus empowering a self-driving, self-healing network. The VNA Interface Engine, backed by AI/ML residing in the cloud, is available to Mist customers as a SaaS offering.
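To make the anomaly detection idea concrete, here’s a deliberately simple illustration in Python: a z-score check over a stream of per-interval failure counts. This is not Mist’s model (the VNA’s internals weren’t presented at that level of detail); the metric, sample values, and threshold are all hypothetical.

```python
# Illustrative only: flag a sample that sits far outside its recent history.
# The metric (failed client associations per reporting interval) and the
# threshold are hypothetical, not taken from Mist's presentation.
import numpy as np

def is_anomalous(history, latest, z_threshold=3.0):
    """Return True if `latest` is more than z_threshold standard deviations
    away from the mean of `history`."""
    history = np.asarray(history, dtype=float)
    mean, std = history.mean(), history.std()
    if std == 0:
        return latest != mean
    return abs(latest - mean) / std > z_threshold

recent_failures = [2, 3, 1, 2, 4, 2, 3, 2, 1, 3]
print(is_anomalous(recent_failures, latest=27))  # True -> raise an event
```

A real system would correlate anomalies like this across many metrics and sites before acting; the point here is only the general shape of the check.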
Creating an intelligent, self-driving network requires data, lots of data. To that end, the VNA Interface Engine receives 150 data points every 2 seconds from wireless access points and mobile clients located throughout the world…literally billions of messages are received, processed, and analyzed by the AI backing the VNA Interface Engine. Within the AWS environment, approximately 400TB of data is transferred per day to perform the AI/ML tasks, and both Jeff and Osman made it clear that a solid cloud infrastructure is vital to performing these tasks and offering this service. So Mist set out to answer another question: “How can we build a strong cloud infrastructure that is cost effective, reliable, and scalable?”
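For a sense of scale, here’s some back-of-the-envelope math on that reporting rate. The 150-data-points-every-2-seconds figure comes from the presentation; the device count below is purely a hypothetical stand-in, since Mist didn’t share fleet sizes.

```python
# Back-of-the-envelope ingest math. Only the per-report figures come from
# the presentation; the device count is a hypothetical example.
DATA_POINTS_PER_REPORT = 150
REPORT_INTERVAL_SECONDS = 2
SECONDS_PER_DAY = 86_400

devices = 100_000  # hypothetical number of APs and clients reporting in

reports_per_day = devices * SECONDS_PER_DAY // REPORT_INTERVAL_SECONDS
data_points_per_day = reports_per_day * DATA_POINTS_PER_REPORT
print(f"{reports_per_day:,} reports/day, {data_points_per_day:,} data points/day")
# ~4.3 billion reports and ~648 billion data points per day at this scale
```

Even at a modest (and invented) fleet size, the message volume lands squarely in the billions per day, which is why the underlying cloud architecture matters so much.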
Using AWS Spot Instances to Build a Cost-Effective, Reliable, and Scalable AI/ML Platform
Though there are plans to offer the VNA Interface Engine through multiple public clouds such as Google and Azure, it’s offered today on AWS. What I found amazing is that Osman built VNA’s underlying cloud architecture using spot instances…yes, Mist built a cost-effective, reliable, and scalable cloud infrastructure for its AI/ML platform using unreliable spot instances! Ultimately, Osman chose spot instances because they force reliability into the design and provide huge cost savings, up to 80% cheaper than equivalent on-demand or reserved instances. When using spot instances, you find out very quickly which portions of your code are not resilient.
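If you’ve never requested one, a spot instance launch looks much like any other EC2 launch, just flagged for the spot market. Here’s a minimal boto3 sketch; the AMI, instance type, and region are placeholders, and this is in no way Mist’s actual provisioning code.

```python
# Minimal sketch: launch a one-time Spot Instance with boto3.
# ImageId, InstanceType, and region are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI
    InstanceType="r5.2xlarge",         # placeholder instance type
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {
            "SpotInstanceType": "one-time",
            "InstanceInterruptionBehavior": "terminate",
        },
    },
)
print(response["Instances"][0]["InstanceId"])
```

The catch, of course, is that AWS can reclaim that instance at any time, which is exactly the unreliability Osman’s architecture has to absorb.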
During the presentation, Osman stated the following:
Apps are going to fail, we have to accept it. What we have to do is come up with ways to accept failures and write software that resists failure. In the cloud, you have to embrace unreliability and make reliable systems.
Due to some questions and the time constraint placed upon all CFD presentations, Osman was unable to fully discuss the use of checkpointing to deal with spot’s inherent unreliability, examining real-time lag, recovering from spot instance termination, and monitoring. Though I cannot speak for all of the CFD5 delegates, I feel I can say with confidence that we were all impressed by the work Osman and his team did in building the infrastructure that supports Mist’s AI/ML capabilities.
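To give a flavor of what “recovering from spot instance termination” can involve, here’s one common pattern: poll the EC2 instance metadata endpoint for the termination notice AWS publishes roughly two minutes before reclaiming a spot instance, and checkpoint in-flight state when it appears. The save_checkpoint function below is a hypothetical stand-in; Osman didn’t walk through Mist’s exact mechanism.

```python
# Sketch of reacting to a spot interruption: poll the instance metadata
# endpoint for the termination notice and checkpoint state when it appears.
# save_checkpoint() is a hypothetical stand-in for persisting work (e.g. to S3).
import time
import urllib.request
import urllib.error

NOTICE_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

def termination_notice_pending():
    try:
        with urllib.request.urlopen(NOTICE_URL, timeout=1) as resp:
            return resp.status == 200   # 200 + JSON body => notice issued
    except urllib.error.URLError:
        return False                    # endpoint returns 404 until a notice exists

def save_checkpoint():
    """Hypothetical: flush partial results / stream offsets to durable storage."""
    ...

while True:
    if termination_notice_pending():
        save_checkpoint()
        break
    time.sleep(5)
```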
To close, let me bullet point a few more talking points from Osman:
- 80% of the total cloud infrastructure, and 100% of the instances required to support the anomaly detection workflow, run on spot instances. But is it reliable? One day this past March, 800 spot instances were terminated…and no one noticed anything.
- If you’re even average at math, you can easily deduce that Mist’s AWS AI/ML environment consists of approximately 20% non-spot instances. Though a relatively low percentage of the total number of instances, those non-spot (on-demand/reserved) instances cost considerably more per hour, so they constitute 52% of the AWS cost.
- In a very honest moment, Osman shared that the first time spot instances were used, an entire cluster was lost when the instances were terminated. It taught Mist a valuable lesson on using spot instances effectively: don’t pick one instance type; diversify across spot markets and overprovision to account for spot instance termination/creation cycles (see the sketch after this list).
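Here’s a rough sketch of what that “diversify and overprovision” lesson can look like in practice, using a Spot Fleet request with several instance types and availability zones, a diversified allocation strategy, and a target capacity padded above what the workload strictly needs. All IDs, roles, and sizes below are placeholders, not Mist’s configuration.

```python
# Sketch: a diversified, overprovisioned Spot Fleet request.
# IAM role ARN, AMI, instance types, AZs, and capacities are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

NEEDED_CAPACITY = 100
OVERPROVISION_FACTOR = 1.2  # headroom for instances lost to reclamation

response = ec2.request_spot_fleet(
    SpotFleetRequestConfig={
        "IamFleetRole": "arn:aws:iam::123456789012:role/spot-fleet-role",
        "AllocationStrategy": "diversified",
        "TargetCapacity": int(NEEDED_CAPACITY * OVERPROVISION_FACTOR),
        "LaunchSpecifications": [
            {"ImageId": "ami-0123456789abcdef0", "InstanceType": itype,
             "Placement": {"AvailabilityZone": az}}
            for itype in ("r5.2xlarge", "r4.2xlarge", "m5.2xlarge")
            for az in ("us-west-2a", "us-west-2b")
        ],
    },
)
print(response["SpotFleetRequestId"])
```

The idea is that a price spike or capacity crunch in any single spot market only takes out a slice of the fleet, and the extra capacity absorbs the churn while replacement instances spin up.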
Final Thoughts
The ultimate goal of Mist’s AI/ML infrastructure is to enable an intelligent, self-driving network. Jeff acknowledged that the self-driving network won’t be here overnight; Mist will have to establish trust and prove that its AI model is making the right decisions before network administrators willfully, and perhaps joyfully, hand over the keys to the network. But Mist is well on its way to harnessing intelligence to adapt the network for business use, and when that day comes, organizations can begin to focus less on the network and more on innovation. To me, the VNA Interface Engine is a representation of the seemingly endless potential made possible through cloud computing.
Additional Resources
In addition to the links in the post, check out the following resources:
Juniper/Mist Cloud Field Day 5 Presentations