Reboot the cloud? Yes, it has happened, and here’s why.

Every now and then, though rarely, major cloud providers such as Amazon must consider interrupting service to reboot parts of their environments. It is a curious thing, and it leads one to ask, “Why?”

As with many things in technology, we are often led to believe that the “new” thing, be it a hypervisor, a mobile operating system, an Internet-of-Things device, or what have you, is impenetrable and somehow beyond the reach and attention of attackers. This always proves to be a honeymoon period. Once there are enough of the “new thing” around to make it worthwhile, attackers will, inevitably, find ways to exploit vulnerabilities.

In the case of anticipated reboots of cloud systems hosted by Amazon and Rackspace (among others), vulnerabilities in the Xen hypervisor that underlies their services are the culprit.

In large, virtualized environments, applying patches to hosts can usually be done without interrupting the guests. It’s not that hosts don’t need to be rebooted; rather, live guest instances can be migrated off a target host. Once ‘empty’, the host is patched and rebooted. The live instances can then be returned, and the next host is treated to the same routine.
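The routine just described can be sketched as a small simulation. This is only an illustration under stated assumptions: the `Host` class, the `rolling_patch` function, and the "least-loaded host" placement rule are hypothetical names and choices made up for this example, not how Amazon, Rackspace, or Xen actually schedule migrations, and no real hypervisor API is called.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical stand-in for a physical host running guest instances."""
    name: str
    guests: list = field(default_factory=list)
    patched: bool = False


def rolling_patch(hosts):
    """Patch hosts one at a time so every guest keeps running somewhere.

    Returns a log of the steps taken. Placement on the least-loaded
    other host is an assumption made for this sketch.
    """
    log = []
    for target in hosts:
        others = [h for h in hosts if h is not target]
        if target.guests and not others:
            raise RuntimeError("no spare host to receive live guests")

        # 1. Live-migrate every guest off the target host.
        moved = []
        while target.guests:
            guest = target.guests.pop()
            dest = min(others, key=lambda h: len(h.guests))
            dest.guests.append(guest)
            moved.append((guest, dest))
            log.append(f"migrate {guest}: {target.name} -> {dest.name}")

        # 2. The host is now 'empty', so it can be patched and rebooted.
        target.patched = True
        log.append(f"patch+reboot {target.name}")

        # 3. Return the guests, then move on to the next host.
        for guest, dest in moved:
            dest.guests.remove(guest)
            target.guests.append(guest)
            log.append(f"return {guest}: {dest.name} -> {target.name}")
    return log


if __name__ == "__main__":
    fleet = [Host("host-1", ["vm-a", "vm-b"]), Host("host-2", ["vm-c"])]
    for step in rolling_patch(fleet):
        print(step)
```

The sketch also shows why this approach needs spare capacity: with nowhere to move the guests, the only remaining option is to reboot them along with the host.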

We are used to hearing about vulnerabilities in applications and operating systems. The ubiquitous nature of Adobe applications and Microsoft operating systems makes them ideal targets. Today, virtualization is also ubiquitous. It stands to reason that hypervisors and virtualization management tools are also targets.

Indeed, announcements of vulnerabilities in hypervisors are not new, and they will continue. That should not be surprising.

However, vulnerabilities in hypervisors are disconcerting. If a hypervisor can be compromised, that could give an attacker access to every operating system running on that hypervisor. With full control, an attacker wouldn’t need to compromise the operating systems (though it would be trivial), since the hypervisor controls all hardware access on behalf of the operating systems; that’s pretty much the definition of a hypervisor.

As an example, researchers at Bitdefender, led by Andrei Luțaș, identified a vulnerability in Xen. That vulnerability may, in part, have contributed to an Amazon reboot in September 2014. Of course, we can’t know for sure, as Amazon is, wisely, not about to share that level of information. It may be that they used the occasion to address multiple vulnerabilities (akin to a Microsoft Patch Tuesday). Also, while the fix may have been applied across Amazon’s cloud offering, they indicated that only about ten percent of their EC2 fleet would be affected by the reboot (presumably since most instances can be migrated to another host while a particular host is patched and restarted).

Those who are curious about what a Xen vulnerability looks like, in great detail, can read a whitepaper authored by Andrei Luțaș, here. This particular vulnerability had to do with local privilege escalation within a guest operating system running on Xen.