Improving VM Network I/O Performance

Virtual machine (VM) consolidation has become a common practice in clouds, Grids, and datacenters. While this practice leads to higher CPU utilization, we observe its negative impact on the performance of applications running in VMs. As more VMs share the same core/CPU, the CPU scheduling latency for each VM increases significantly. On the one hand, this increase slows the progress of TCP transmissions to the VMs; on the other hand, the increased CPU access latency experienced by each VM translates into longer I/O processing latency perceived by I/O-bound applications. To mitigate this negative impact, our group has proposed vSnoop, vFlood, and vSlicer. vSnoop and vFlood focus on increasing the TCP throughput of VMs, while vSlicer addresses the performance problems of I/O-bound applications caused by the longer CPU access latency experienced by VMs. Our evaluation results show significant improvements in application-specific performance.

vSnoop: Increased CPU scheduling latency for each VM leads to slower progress of TCP transmissions to the VMs. vSnoop allows the driver domain of a host to acknowledge TCP packets on behalf of the guest VMs – whenever it is safe to do so. Our evaluation of a Xen-based prototype indicates that vSnoop consistently improves TCP throughput for VMs (by orders of magnitude in some scenarios). Our evaluation also shows that the higher TCP throughput leads to improved application-level performance, via experiments with a two-tier online auction application and two suites of MPI benchmarks.
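The core decision in vSnoop is whether an early acknowledgment is safe. A minimal sketch of that logic, assuming hypothetical per-connection state (the next expected sequence number and the free slots in the guest's receive ring; names and structure are illustrative, not taken from the vSnoop implementation):

```python
class ConnState:
    """Hypothetical per-connection state kept in the driver domain."""
    def __init__(self, ring_slots=8):
        self.next_seq = 0            # next in-order sequence number expected
        self.free_slots = ring_slots # free slots in the guest's receive ring

def can_ack_early(conn, pkt_seq):
    # Early ACK is safe only if the packet is in-order AND there is buffer
    # space to hold it until the (possibly descheduled) guest VM runs.
    return pkt_seq == conn.next_seq and conn.free_slots > 0

def handle_packet(conn, pkt_seq, pkt_len):
    if can_ack_early(conn, pkt_seq):
        conn.next_seq += pkt_len
        conn.free_slots -= 1
        return "ACK"    # driver domain ACKs on the guest's behalf
    return "PASS"       # out-of-order or no buffer: defer to the guest's stack
```

The "pass-through" case preserves correctness: whenever the driver domain cannot guarantee delivery, the guest's own TCP stack generates the acknowledgment as usual.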

vFlood: The vFlood approach complements vSnoop. vFlood is a sender-side approach that mitigates the impact of VM consolidation by offloading a portion of a sender VM's TCP stack to the driver domain in order to optimize the transmit path of the TCP connection. More specifically, the sender VM opportunistically “floods” TCP packets to the driver domain, which in turn transmits the packets to the receiver while performing congestion control on behalf of the sender VM. Our evaluation results show significant improvement in both TCP throughput and application-specific performance.
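The division of labor can be sketched as follows: the sender VM dumps packets without waiting for its congestion window, and the driver domain enforces the window on its behalf. This is an illustrative simulation, assuming a hypothetical `DriverDomainTx` class; window growth and retransmission are omitted:

```python
from collections import deque

class DriverDomainTx:
    """Sketch: driver domain queues packets 'flooded' by the sender VM and
    releases at most cwnd unacknowledged packets to the wire, performing
    congestion control on the VM's behalf."""
    def __init__(self, cwnd=4):
        self.cwnd = cwnd       # congestion window enforced by the driver domain
        self.queue = deque()   # packets flooded by the VM, awaiting the window
        self.in_flight = 0     # packets sent but not yet acknowledged

    def flood(self, pkts):
        # The sender VM pushes packets whenever it is scheduled,
        # regardless of the congestion window.
        self.queue.extend(pkts)

    def transmit(self):
        # The driver domain drains the queue only as the window permits.
        sent = []
        while self.queue and self.in_flight < self.cwnd:
            sent.append(self.queue.popleft())
            self.in_flight += 1
        return sent

    def on_ack(self, n=1):
        # Arriving ACKs open the window; the driver domain can keep the
        # connection busy even while the sender VM is descheduled.
        self.in_flight = max(0, self.in_flight - n)
```

The key benefit is that ACK processing and further transmission no longer wait for the sender VM's next scheduling turn.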

vSlicer: As more VMs share the same core/CPU, the CPU access latency experienced by each VM increases substantially, which translates into longer I/O processing latency perceived by I/O-bound applications. vSlicer is a hypervisor-level technique. It enables a new class of VMs called latency-sensitive VMs (LSVMs) by scheduling each LSVM more frequently but with a smaller micro time slice. LSVMs can thus achieve better performance for I/O-bound applications while maintaining the same resource share (and thus cost) as other CPU-sharing VMs. vSlicer enables more timely processing of I/O events by LSVMs without violating CPU share fairness among all sharing VMs. Our evaluation of a vSlicer prototype in Xen shows that vSlicer substantially reduces network packet round-trip times and jitter and improves application-level performance.
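A back-of-the-envelope model shows why smaller, more frequent slices help. Under round-robin scheduling, an I/O event that arrives just after an LSVM is descheduled waits while the other VMs run; splitting each slice into micro time slices shrinks that wait proportionally while leaving the total CPU share per scheduling period unchanged. This is an illustrative calculation, not vSlicer's actual scheduler logic:

```python
def worst_case_io_wait(num_vms, slice_ms, micro_slices=1):
    """Worst-case time (ms) an I/O event waits before its VM is scheduled
    again, under simple round-robin among num_vms VMs with slice_ms slices.
    With micro_slices > 1, each VM's slice is split into that many smaller
    slices scheduled more frequently, so the wait shrinks proportionally
    while per-period CPU share stays the same."""
    return (num_vms - 1) * slice_ms / micro_slices

# Example: 4 VMs sharing a core with 30 ms slices.
# A regular VM's I/O event may wait up to 90 ms; an LSVM with 6 micro
# slices waits at most 15 ms, yet both receive 30 ms of CPU per period.
```

This matches the intuition in the paragraph above: latency drops for the LSVM without any change to fairness of CPU allocation.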

Publications

  • “vSnoop: Improving TCP Throughput in Virtualized Environments via Acknowledgment Offload”. Ardalan Kangarlou, Sahan Gamage, Ramana Rao Kompella, and Dongyan Xu. ACM/IEEE Supercomputing 2010 (SC10), New Orleans, LA, November 2010 [Best Student Paper Finalist].
  • “Opportunistic Flooding to Improve TCP Transmit Performance in Virtualized Clouds”. Sahan Gamage, Ardalan Kangarlou, Ramana Rao Kompella, and Dongyan Xu. ACM Symposium on Cloud Computing (SOCC'11), Cascais, Portugal, Oct 2011 [Paper of Distinction].
  • “vSlicer: Latency-Aware Virtual Machine Scheduling via Differentiated-Frequency CPU Slicing”. Cong Xu, Sahan Gamage, Pawan N. Rao, Ardalan Kangarlou, Ramana Rao Kompella, and Dongyan Xu. Proceedings of the 21st ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC 2012), Delft, the Netherlands, June 2012 [Nominee for Best Paper and Best Presentation Awards].
  • “vSnoop: Improving TCP Throughput in Virtualized Environments via Acknowledgment Offload”. Sahan Gamage, Ardalan Kangarlou, Ramana Rao Kompella, and Dongyan Xu. Poster in 7th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2010), April 2010.
  • “vFlood: Opportunistic Flooding to Improve TCP Transmit Performance in Virtualized Clouds”. Sahan Gamage, Ardalan Kangarlou, Ramana Rao Kompella, and Dongyan Xu. 8th USENIX Symposium on Networked Systems Design and Implementation (NSDI'11), Boston, MA, Mar 2011.

Software

We are planning to release the vSnoop and vFlood software in the near future.

People

Sponsors

This project has been sponsored in part by the National Science Foundation (NSF).

vsnoop.txt · Last modified: 2012/07/18 02:25 by akangarl