Building a more resourceful cloud
15 June, 2011 07:47
While managers may anticipate how cloud computing will one day ease IT headaches, the purveyors of cloud services still need to fine-tune how those services are metered and managed to make truly flexible cloud computing a reality.
The first round of papers presented at the Usenix "HotCloud 2011" Workshop on Hot Topics in Cloud Computing, held this week in Portland, Oregon, focused on exploring new approaches in scheduling workloads in the cloud that could benefit both cloud users and providers.
Researchers discussed how to define cloud compute jobs by the hardware they need, and how pricing could be made more flexible than today's static schedulers will allow.
Job scheduling, in which multiple compute jobs are balanced across a single compute node, is not a new issue in IT, but its importance is paramount in cloud computing.
"The physical resources to actually implement a cloud are incredibly expensive, and so, having built it, you want to pack in as much work as possible into that infrastructure. Hence scheduling becomes really important," said David Maltz, a researcher at Microsoft Research who helped organize the workshop.
How work should get scheduled in a cloud in an equitable manner, however, raises a whole new set of questions and challenges, the presenters pointed out.
For instance, different customers may want different capabilities, noted Gunho Lee, a researcher at the University of California, Berkeley. One job may require a great deal of CPU usage, while another application may chiefly need memory space. Likewise, some servers are built for maximum CPU performance, and others are more memory-focused.
As a result of these differing desired characteristics, "we need to track job affinity to determine which type of machine offers the most suitable cost-performance trade-off for a job," stated Lee's paper, which he co-wrote with the University of California's Randy Katz and Yahoo Research's Byung-Gon Chun.
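The idea of job affinity can be illustrated with a minimal sketch. It is not taken from Lee's paper: the `MachineType` and `Job` fields, the cost formula and all numbers are hypothetical, chosen only to show how a scheduler might pick the machine type with the lowest estimated cost for a given job.

```python
from dataclasses import dataclass


@dataclass
class MachineType:
    name: str
    cpu_speed: float    # relative CPU throughput (hypothetical units per hour)
    memory_gb: float    # memory available on this machine type
    hourly_cost: float  # price per hour (hypothetical)


@dataclass
class Job:
    name: str
    cpu_work: float     # total CPU work, in the same hypothetical units
    memory_gb: float    # memory the job needs


def estimated_cost(job, machine):
    """Estimate the cost of running job on machine, or None if it does not fit."""
    if job.memory_gb > machine.memory_gb:
        return None
    hours = job.cpu_work / machine.cpu_speed
    return hours * machine.hourly_cost


def best_machine(job, machines):
    """Pick the machine type with the lowest estimated cost for this job."""
    feasible = [(estimated_cost(job, m), m) for m in machines]
    feasible = [(cost, m) for cost, m in feasible if cost is not None]
    if not feasible:
        raise ValueError(f"no machine type can run {job.name}")
    return min(feasible, key=lambda pair: pair[0])[1]


# Assumed example machine types: one CPU-focused, one memory-focused.
cpu_box = MachineType("cpu-optimized", cpu_speed=4.0, memory_gb=8.0, hourly_cost=2.0)
mem_box = MachineType("memory-optimized", cpu_speed=1.0, memory_gb=64.0, hourly_cost=1.5)
```

In this toy model a CPU-heavy job lands on the CPU-focused machine because it finishes sooner there, while a job that needs more memory than that machine offers can only go to the memory-focused one.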
The researchers proposed each cloud should have two types of machines. One set of servers would be considered "core nodes," which would do the basic chores, while another set would be considered "accelerator nodes" that can be temporarily added to help with executing the more computationally intensive workloads.
Users would submit jobs to what the researchers call a "cloud driver," which would allocate the appropriate number of nodes to the task.
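As a rough illustration of the core and accelerator split, the sketch below shows one way a driver could decide how many accelerator nodes to add for a job. The function name, the capacity units and the allocation rule are assumptions made for this example; the article does not describe how the researchers' cloud driver actually makes this decision.

```python
import math


def allocate_nodes(demand, core_nodes, core_capacity, accel_capacity):
    """Return how many core and accelerator nodes to assign to a job.

    All quantities are in the same hypothetical capacity units. Core nodes
    are always used; accelerator nodes are added only to cover any shortfall.
    """
    if accel_capacity <= 0:
        raise ValueError("accel_capacity must be positive")
    core_total = core_nodes * core_capacity
    shortfall = max(0.0, demand - core_total)
    accelerators = math.ceil(shortfall / accel_capacity)
    return {"core": core_nodes, "accelerator": accelerators}
```

Under these assumptions, a job that fits within the core nodes gets no accelerators, and a larger job gets just enough of them to cover the excess demand.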
Other variables that cloud computing services should take into account include whether the customer wants a job executed quickly, at the lowest possible cost, or some combination of the two, noted Damien Zufferey, one of the researchers from the Institute of Science and Technology Austria (IST) who developed a technique that lets users trade off speed against the cost of running a job. The faster the customer wants the job to execute, the more that customer should pay, whereas economy jobs may take longer to finish, he argued.
This IST research group developed a prototype scheduling system called Flextic, which can take these differences into account. Key to Flextic's approach is a set of desired operational characteristics submitted by the user, such as how much data will be processed and the maximum amount of time the job should take to execute. "We want to be able to ask some information about a job we are doing," Zufferey said.
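The speed-versus-cost trade-off can be sketched with a toy price quote. This is not Flextic's pricing model, which the article does not detail: the function `quote_price`, its default rates and the urgency formula are all hypothetical. The only inputs borrowed from the description above are the amount of data to process and the maximum time the job may take.

```python
def quote_price(data_gb, max_hours, rate_per_gb=0.10, economy_gb_per_hour=10.0):
    """Quote a price for a job given its data size and deadline.

    Hypothetical model: an economy job is processed at economy_gb_per_hour
    and pays the base rate. Asking for a shorter deadline than the economy
    run time scales the price up in proportion to the speed-up requested.
    """
    if data_gb <= 0 or max_hours <= 0:
        raise ValueError("data_gb and max_hours must be positive")
    economy_hours = data_gb / economy_gb_per_hour
    urgency = max(1.0, economy_hours / max_hours)
    return data_gb * rate_per_gb * urgency
```

With these assumed defaults, a 100 GB job that may take 10 hours or more pays the base price, and halving the deadline to 5 hours doubles the quote.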
Scheduling can be affected by other factors as well. For instance, when you move a virtual machine from one server to another while it is still running, it can slow down all the other virtual machines running on each of those servers, said Seung-Hwan Lim, a Pennsylvania State University researcher who co-authored a paper on this topic with Chita R. Das, also of Pennsylvania State University, and Jae-Seok Huh and Youngjae Kim of Oak Ridge National Laboratory.
These researchers looked at ways of quantifying the overall effect that moving a virtual machine would have on its surrounding workloads. "In order to have robust and predictable performance, we need migration-aware schedulers," Lim said.
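A migration-aware scheduler needs some estimate of that effect before it moves anything. The sketch below is a deliberately simple stand-in, not the model from Lim's paper: it assumes migration time is just memory size divided by link bandwidth, and that every co-located virtual machine on the source and destination suffers the same fractional slowdown for that duration. All names and parameters are hypothetical.

```python
def migration_penalty(vm_memory_gb, link_gbps, slowdown, vms_on_source, vms_on_dest):
    """Estimate lost VM-seconds caused by one live migration (toy model).

    slowdown is the assumed fractional slowdown (0.1 means 10 percent) that
    each neighboring VM experiences while the transfer is in progress.
    """
    transfer_seconds = vm_memory_gb * 8 / link_gbps  # GB to gigabits
    affected = vms_on_source + vms_on_dest
    return transfer_seconds * slowdown * affected


def pick_destination(vm_memory_gb, link_gbps, slowdown, vms_on_source, candidates):
    """Choose the candidate host whose neighbors would suffer least.

    candidates maps a host name to the number of VMs already running on it.
    """
    return min(
        candidates,
        key=lambda host: migration_penalty(
            vm_memory_gb, link_gbps, slowdown, vms_on_source, candidates[host]
        ),
    )
```

In this simplified model the least crowded destination always wins; a real scheduler would also have to weigh factors the sketch ignores, such as memory pages that change during the transfer.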