In the last post I covered Paragon, a QoS-aware resource scheduler. In this paper, the same authors extend Paragon to improve cluster utilization, whether on-prem or in the cloud. Background: It's well known that cloud users waste most of the capacity they provision. In this paper, the authors analyzed a production cluster from … Continue reading Quasar: Resource-Efficient and QoS-Aware Cluster Management
Category: scheduling
Paragon: QoS-Aware Scheduling for Heterogeneous Datacenters
After a long pause (I blame it on starting a startup...), I'd like to continue the cluster scheduling series that I started in 2015! In today's post I'd like to cover Paragon, a Quality of Service (QoS) aware cluster scheduler that uses machine learning to inform its service placement decisions. This is work that was … Continue reading Paragon: QoS-Aware Scheduling for Heterogeneous Datacenters
Hierarchical Scheduling for Diverse Datacenter Workloads
In this post we'll cover the paper that introduced HDRF (Hierarchical Dominant Resource Fairness), which builds upon the team's earlier work on DRF (Dominant Resource Fairness) and adds support for hierarchical scheduling. Background: The prior work, DRF, is an algorithm that decides how to allocate multi-dimensional resources … Continue reading Hierarchical Scheduling for Diverse Datacenter Workloads
Sparrow: Scalable Scheduling for Sub-Second Parallel Jobs
Background: In the previous posts on datacenter scheduling, most of the focus has been on long-running services or batch jobs that run from minutes to days. Sparrow targets a different use case: the scheduling problem of placing jobs that run … Continue reading Sparrow: Scalable Scheduling for Sub-Second Parallel Jobs