Stanford engineers create a software tool to reduce the cost of cloud computing
Just as Netflix uses an algorithm to recommend movies we ought to see, their system suggests how to use computing resources at data centers more efficiently.
We hear a lot about the future of computing in the cloud but not much about the efficiency of data centers, those facilities where clusters of server computers work together to host applications ranging from social networks to big data analytics.
Data centers cost millions of dollars to build and operate, and buying servers is the single largest expense. Yet at any given moment, most of the servers in a typical data center are using only 20 percent of their capacity. Why? Because the workload can vary greatly depending on factors such as how many users log on. Since data centers must always be ready to meet peak demand, keeping excess capacity on hand is the best way to ensure that readiness today.
But as cloud computing grows, so will the cost of keeping such large cushions of capacity. That’s why two Stanford engineers have created a cluster management tool that can triple server efficiency while delivering reliable service at all times, allowing data center operators to serve more customers for each dollar they invest.
Christos Kozyrakis, a professor of electrical engineering and computer science, and Christina Delimitrou, a doctoral student in electrical engineering, will explain their cluster management system, called Quasar, when scientists who design and run data centers meet for a conference that begins March 1.
Even in advance of that presentation, independent academic and industrial computer engineers familiar with their research praised Quasar.
“This is a proof of concept for an approach that could change the way we manage server clusters,” said Jason Mars, a computer science professor at the University of Michigan-Ann Arbor.
[Photo: Stanford doctoral student Christina Delimitrou and Professor Christos Kozyrakis borrowed an idea from Netflix to improve data center efficiency. Credit: Norbert von der Groeben]
Kushagra Vaid, general manager for cloud server engineering at Microsoft Corp., said that the largest data center operators have devised ways to manage their operations but that a great many smaller organizations haven’t.
“If you can double the amount of work you do with the same server footprint, it would give you the agility to grow your business fast,” said Vaid, who oversees a global operation with more than a million servers catering to more than a billion users.
How Quasar works takes some explaining, but one key ingredient is a sophisticated algorithm that is modeled on the way companies such as Netflix and Amazon recommend movies, books and other products to their customers.
How it works
To grasp what’s new about Quasar, it’s helpful to think about how data centers are managed today.
Data centers run applications such as search and social media for consumers or data mining and large-scale data analysis for businesses. Each of these applications places different demands on the data center and requires different amounts of server capacity.
The cloud ecosystem includes developers who run applications, and cluster management tools that decide how to apportion the workload and which applications to assign to which servers. Before making such assignments, the cluster managers typically ask developers how much capacity these applications will require. Developers reserve server capacity much as you might reserve a table at a restaurant.
“Today data centers are managed by a reservation system,” Stanford’s Kozyrakis explains. “Application developers estimate what resources they will need, and they reserve that server capacity.”
It’s easy to understand how a reservation system lends itself to excess idle capacity. Developers are likely to err on the side of caution. Because a typical data center runs many applications, the total of all those overestimates results in a lot of excess capacity.
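The arithmetic behind that idle capacity can be sketched in a few lines of Python. The application names and core counts below are invented for illustration and are not taken from the Quasar study; the point is only that utilization is the capacity in use divided by the capacity reserved.

```python
# Hypothetical figures: (cores reserved by the developer, cores in use at a typical moment).
reservations = {
    "search": (400, 90),
    "social_feed": (300, 60),
    "analytics": (300, 50),
}

def cluster_utilization(apps):
    """Return cores in use divided by cores reserved, summed over all applications."""
    reserved = sum(r for r, _ in apps.values())
    used = sum(u for _, u in apps.values())
    return used / reserved

utilization = cluster_utilization(reservations)
print(f"Reserved capacity in use: {utilization:.0%}")
```

With these made-up numbers, 1,000 cores are reserved and 200 are busy, which matches the 20 percent figure typical of data centers today.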
Kozyrakis has been working with Delimitrou, a graduate student in his Stanford lab, to change this dynamic by moving away from the reservation system.
Instead of asking developers to estimate how much capacity they are likely to need, the Stanford system would start by asking what sort of performance their applications require. For instance, if an application involves queries from users, how quickly must the application respond and to how many users?
Under this approach the cluster manager would have to make sure there was enough server capacity in the data center to meet all these requirements.
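One way to picture a performance-based request is as a small calculation, sketched here in Python. The function name, the throughput figures and the 10 percent headroom are assumptions made for illustration; the article does not describe how Quasar itself turns a performance target into a server count.

```python
import math

def servers_needed(target_qps, qps_per_server, headroom=0.1):
    """Smallest number of servers that can handle target_qps plus a safety margin.

    qps_per_server is the throughput one server is assumed to sustain while
    still answering each query quickly enough (a hypothetical measurement).
    """
    if qps_per_server <= 0:
        raise ValueError("qps_per_server must be positive")
    return math.ceil(target_qps * (1 + headroom) / qps_per_server)

# The developer states the goal (10,000 queries per second); the manager sizes the cluster.
print(servers_needed(target_qps=10_000, qps_per_server=1_200))
```

The developer states only the goal. Sizing the cluster becomes the manager's job, which is the shift Kozyrakis describes.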
“We want to switch from a reservation-based cluster management to a performance-based allocation of data center resources,” Kozyrakis said.
Quasar is designed to help cluster managers meet these performance goals while also using data center resources more efficiently. To create this tool the Stanford team borrowed a concept from the Netflix movie recommendation system.
If you liked this application . . .
Before delving into the algorithms behind Quasar, understand that servers, like people, can multitask. So the simplest way to increase server utilization would be to run several applications on the same server.
But multitasking doesn’t always make sense. A parent, for instance, might be able to wash dishes, watch television and still spell a word to help a child with homework. But if the question involved algebra, it might be wise for the parent to dry their hands, turn off the TV and look at the problem.
The same is true for software applications and servers. Sometimes they can coexist on the same server and still achieve their performance goals; other times they can’t.
Quasar automatically decides what type of servers to use for each application and how to multitask servers without compromising any specific task.
“Quasar recommends the minimum number of servers for each application and which applications can run best together,” Delimitrou said.
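The question of which applications can run together can be framed as a simple check, sketched below in Python. Every name and number is hypothetical: each application has a response time when it runs alone, a goal it must meet, and an assumed slowdown factor when it shares a server with a particular neighbor. This is an illustration of the idea, not Quasar's actual method.

```python
# Hypothetical response times in milliseconds when each application runs alone.
baseline_ms = {"web": 40.0, "batch": 900.0, "cache": 5.0}
# Hypothetical performance goals: the slowest acceptable response time.
goal_ms = {"web": 60.0, "batch": 2000.0, "cache": 6.0}
# Assumed slowdown of the first application when it shares a server with the second.
slowdown = {
    ("web", "batch"): 1.2, ("batch", "web"): 1.1,
    ("web", "cache"): 1.8, ("cache", "web"): 1.5,
}

def meets_goal(app, neighbor):
    """True if app still meets its goal while sharing a server with neighbor."""
    return baseline_ms[app] * slowdown[(app, neighbor)] <= goal_ms[app]

def can_share(a, b):
    """Two applications may share a server only if both still meet their goals."""
    return meets_goal(a, b) and meets_goal(b, a)

print(can_share("web", "batch"))
print(can_share("web", "cache"))
```

In this toy data the web and batch applications can coexist, while the web and cache applications slow each other too much to share a server.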
This isn’t easy.
Data centers host thousands of applications on many different types of servers. How does Quasar match the right applications with the right server resources?
By using a process known as collaborative filtering – the same technique that sites such as Netflix use to recommend shows that we might want to watch.
Collaborative filtering takes known facts, such as the movie viewing preferences of one group of subscribers, and uses that data to offer educated guesses about a future action, such as what a subscriber similar to the others might like to watch.
Applying this principle to data centers, the Quasar database knows how certain applications performed on certain types of servers. Through collaborative filtering, Quasar uses this knowledge to decide things such as how much server capacity to use to achieve a certain level of performance, and when it’s okay to multitask servers and still expect good results.
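A minimal sketch of the idea in Python follows. It uses neighborhood-based collaborative filtering, one of the simplest variants; the article does not say which variant Quasar uses, and the application names, server types and scores are all invented. A new application is profiled briefly on two server types, and its likely score on a third is estimated from the two known applications that behaved most like it.

```python
import math

# Hypothetical scores (higher is better) for applications already seen
# on three server types. None of these numbers come from the Quasar paper.
history = {
    "app_a": {"small": 1.0, "medium": 2.0, "large": 4.0},
    "app_b": {"small": 1.0, "medium": 2.1, "large": 3.9},
    "app_c": {"small": 3.0, "medium": 3.1, "large": 3.0},
}

def similarity(profile, known):
    """Similarity between 0 and 1, based on distance over shared server types."""
    shared = [s for s in profile if s in known]
    if not shared:
        return 0.0
    distance = math.sqrt(sum((profile[s] - known[s]) ** 2 for s in shared))
    return 1.0 / (1.0 + distance)

def predict(profile, server_type, history, k=2):
    """Estimate a score on an untested server type from the k most similar apps."""
    neighbors = sorted(
        ((similarity(profile, scores), scores[server_type])
         for scores in history.values() if server_type in scores),
        reverse=True,
    )[:k]
    total_weight = sum(w for w, _ in neighbors)
    if total_weight == 0:
        return None
    return sum(w * score for w, score in neighbors) / total_weight

# A new application, profiled only on small and medium servers.
new_app = {"small": 1.0, "medium": 2.0}
estimate = predict(new_app, "large", history)
print(f"Estimated score on a large server: {estimate:.2f}")
```

Because the new application resembles app_a and app_b far more than app_c, the estimate lands close to their scores on large servers, much as a movie recommendation leans on viewers with similar tastes.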
Thomas Wenisch, a computer science professor at the University of Michigan, said it was a conceptual breakthrough when the Stanford team first revealed last year how it planned to apply collaborative filtering to data center efficiency.
Now he is intrigued by the Quasar paper, in which Kozyrakis and Delimitrou show how they achieved utilization rates as high as 70 percent in a 200-server test bed, compared with the current typical 20 percent, while still meeting strict performance goals for each application.
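To put those percentages in perspective, here is a back-of-the-envelope calculation in Python. It assumes, as a simplification not made in the article, that the useful work a cluster does scales linearly with its utilization.

```python
def servers_for_same_work(servers, high_pct, low_pct):
    """Servers needed at low_pct utilization to match `servers` running at high_pct."""
    return servers * high_pct / low_pct

# The 200-server test bed at 70 percent versus a cluster running at the typical 20 percent.
equivalent = servers_for_same_work(200, high_pct=70, low_pct=20)
utilization_gain = 70 / 20
print(f"Servers needed at 20 percent utilization: {equivalent:.0f}")
print(f"Work per server, relative to today: {utilization_gain:.1f}x")
```

Under that assumption, the work done by the 200-server test bed at 70 percent utilization would take 700 servers running at 20 percent.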
“Part of the reason the Quasar paper is so convincing is that they have so much supporting data,” Wenisch said.
Next steps
Increasing data center efficiency will be essential for cloud computing to grow. Forget the cost of building new data centers. These installations draw so much electricity that escalating demand threatens to overtax power plant output. So throwing more servers into the data center isn’t the answer, even if money were no object.
But while they pursue higher efficiency from multitasking servers, data center operators must deliver consistent levels of service. They can’t allow some customers to suffer unusually slow responses because the servers are processing the wrong mix of tasks, a problem measured as “tail latency.”
“The explosive growth of cloud computing is going to require more research like this,” said Partha Ranganathan, a principal engineer at Google who is on the team that is designing next-generation systems and data centers. “Focusing on resource management to address the twin challenges of energy efficiency and tail latency can have significant upside.”
Kozyrakis and Delimitrou are currently improving Quasar to scale to data centers with tens of thousands of servers and manage applications that span multiple data centers.
“No matter how well we manage resources in one data center, there will always be cases that exceed its capacity,” Delimitrou said. “Offloading parts of work to other facilities in an efficient manner is key to achieving the flexibility that cloud computing promises.”
Tom Abate is associate director of communications at the School of Engineering.
Last modified Fri, 28 Feb, 2014 at 10:01