Split Migration of Large Memory Virtual Machines
Recent IaaS clouds offer virtual machines with large memory capacities, such as Amazon EC2's X1 instances with 2 TB, which are crucial for big data analytics. Migrating such large memory VMs is difficult, however, because it is not cost-efficient to always reserve destination hosts with that much free memory. When a VM cannot be migrated, big data analysis can be disrupted for a long time, particularly for in-memory databases such as SAP HANA. This presentation explores several migration strategies, including migration with virtual memory, migration with remote paging, and split migration, an approach that uses multiple hosts to migrate and run a large memory VM.
Presentation Transcript
Split Migration of Large Memory Virtual Machines
Masato Suetake, Kenichi Kourai, Hazuki Kizu (Kyushu Institute of Technology)
Large Memory VMs
Recent IaaS clouds provide virtual machines (VMs) with a large amount of memory, e.g., the new X1 instances (2 TB) in Amazon EC2. Such VMs are required for big data analysis using Apache Spark and for in-memory databases such as SAP HANA.
Migration of Large Memory VMs
Large memory VMs make VM migration difficult because it is not cost-efficient to always reserve hosts with a large amount of free memory. If such VMs cannot be migrated, big data analysis is disrupted for a long time and the whole data in memory is lost after a restart.
[Figure: migrating a 2 TB VM to a destination host with only 1 TB of free memory]
VM Migration with Virtual Memory
Virtual memory allows a host to provide a larger amount of memory than its physical memory, but it is incompatible with VM migration. Page-outs occur regardless of the VM's access pattern because all pages are transferred in the first iteration, so read-only pages tend to be paged out, which degrades performance during and after VM migration.
[Figure: migrating a 2 TB VM to a destination host with 1 TB of physical memory backed by 1 TB on disk]
VM Migration with Remote Paging
Remote paging can use multiple hosts with a small amount of free memory and may be faster than paging to local disks. However, it is also incompatible with VM migration: even pages stored in swap hosts are transferred via the destination host, which consumes network bandwidth.
[Figure: migrating a 2 TB VM to a destination host with 1 TB of free memory that pages out to a swap host]
S-memV
Split migration: migrate a large memory VM using multiple hosts, with one main host for running the VM and zero or more sub-hosts for providing memory. The VM's memory is divided according to its access pattern (see the sketch below). Remote paging: pages are swapped between the main host and the sub-hosts.
[Figure: the VM's memory is split between the main host and a sub-host]
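As an illustration of how the memory could be divided, the following is a minimal sketch in C, assuming a per-page access count collected elsewhere; the names and the simple capacity-based policy are assumptions for illustration, not S-memV's actual implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Where a page will live after split migration (hypothetical encoding). */
enum page_dest { DEST_MAIN = 0, DEST_SUB = 1 };

struct page_stat {
    size_t   idx;    /* page index within the VM's memory */
    uint64_t count;  /* recent access count for this page */
};

/* Sort pages so that the most frequently accessed ones come first. */
static int cmp_hot_first(const void *a, const void *b)
{
    const struct page_stat *pa = a, *pb = b;
    if (pa->count != pb->count)
        return pa->count > pb->count ? -1 : 1;
    return 0;
}

/*
 * Divide a VM's pages between the main host and the sub-hosts: the
 * `main_capacity` most frequently accessed pages stay with the VM on the
 * main host, and everything else is placed on a sub-host.
 */
static void split_pages(struct page_stat *pages, size_t npages,
                        size_t main_capacity, enum page_dest *dest)
{
    qsort(pages, npages, sizeof(pages[0]), cmp_hot_first);
    for (size_t i = 0; i < npages; i++)
        dest[pages[i].idx] = (i < main_capacity) ? DEST_MAIN : DEST_SUB;
}
```

Sorting by access count approximates "recently used pages go to the main host"; other policies, such as a stricter LRU built from the access history, would fit the same interface.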
One-to-N Migration
Migrate a VM to multiple hosts. The main host receives the VM's core information (CPU and device states) and the frequently accessed pages; the sub-hosts receive the pages that cannot be accommodated in the main host. The transfers are done in parallel (see the sketch below).
[Figure: a 2 TB VM on the source host is migrated in parallel to a 1 TB main host and a 1 TB sub-host]
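A sketch of how each page might be routed to its destination during one-to-N migration, assuming one connection per destination host; the wire format and function names are hypothetical, and the loop is serialized here although the real transfers are done in parallel.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

#define PAGE_SIZE 4096

/* Simple wire header preceding each transferred page (hypothetical format). */
struct page_hdr {
    uint64_t pfn;  /* guest page frame number */
    uint64_t len;  /* payload length, always PAGE_SIZE here */
};

/* Send one guest page over the given connection. */
static int send_page(int sock, uint64_t pfn, const void *data)
{
    struct page_hdr hdr = { .pfn = pfn, .len = PAGE_SIZE };
    if (send(sock, &hdr, sizeof(hdr), 0) != sizeof(hdr))
        return -1;
    if (send(sock, data, PAGE_SIZE, 0) != PAGE_SIZE)
        return -1;
    return 0;
}

/*
 * Route each page either to the main host or to a sub-host, according to
 * the placement computed from the access history (dest[pfn]: 0 = main host,
 * 1 = sub-host).  `ram` is the VM's memory on the source host.
 */
static int migrate_pages(const uint8_t *ram, const int *dest, size_t npages,
                         int main_sock, int sub_sock)
{
    for (size_t pfn = 0; pfn < npages; pfn++) {
        int sock = (dest[pfn] == 0) ? main_sock : sub_sock;
        if (send_page(sock, pfn, ram + pfn * PAGE_SIZE) < 0)
            return -1;
    }
    return 0;
}
```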
Aware of Remote Paging
Remote paging does not occur at all during VM migration because each page is directly transferred to either the main host or a sub-host. It is also less likely to occur just after VM migration because the frequently accessed pages are stored in the memory of the main host, depending on the working set.
[Figure: the working set is kept in the main host's memory, and other pages are paged in from the sub-host]
N-to-One Migration
Migrate a VM from multiple hosts to one. From the main host, this is normal migration except for the non-existent pages; from the sub-hosts, it is a simple memory transfer. Pages are transferred without redundancy or omission, even for pages that are paged in or out during the migration (see the sketch below).
[Figure: the VM is migrated from a 1 TB main host and a 1 TB sub-host to a 2 TB destination host while remote paging continues]
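One way to keep the "no redundancy, no omission" guarantee is to track, per page, its current holder and whether it has already reached the destination. The sketch below illustrates that bookkeeping in C; it abstracts away the fact that in the real system the hosts coordinate over the network, and all names are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Which host currently holds a page. */
enum holder { HELD_BY_MAIN = 0, HELD_BY_SUB = 1 };

struct page_track {
    enum holder holder;  /* current location of the page            */
    bool        sent;    /* already transferred to the destination? */
};

/*
 * Ask whether the calling host should transfer page `pfn` now.  Only the
 * host that currently holds the page sends it, and the `sent` flag ensures
 * each page reaches the destination exactly once (no redundancy).
 */
static bool claim_page(struct page_track *t, size_t pfn, enum holder me)
{
    if (t[pfn].holder != me || t[pfn].sent)
        return false;
    t[pfn].sent = true;
    return true;
}

/*
 * Called when remote paging moves a page between the main host and a
 * sub-host during migration.  Ownership changes, but the `sent` flag is
 * preserved, so a moved page is neither re-sent nor forgotten (no omission).
 */
static void page_moved(struct page_track *t, size_t pfn, enum holder to)
{
    t[pfn].holder = to;
}
```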
Partial Migration
Migrate the whole or part of a VM across multiple hosts to different hosts. From the main host, this is one-to-N migration of a VM with partial memory; from the sub-hosts, only a memory transfer to the destination sub-hosts is needed.
[Figure: the main host and a sub-host migrate the VM to a new main host and new sub-hosts]
System Architecture of S-memV
QEMU-KVM at the main host supports one-to-N migration, maintains the page locations of the VM, and runs the VM with remote paging. Memory servers at the sub-hosts manage part of the memory of the VM and handle page-in/page-out requests. A host management server chooses the sub-hosts.
[Figure: the VM and its memory under QEMU-KVM on the main host, and the memory server on a sub-host]
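The page location table mentioned above could look like the following minimal sketch, which the main host updates on every page-in and page-out; the structure and names are assumptions for illustration, not QEMU-KVM's or S-memV's actual data structures.

```c
#include <stdint.h>
#include <stdlib.h>

/* Location of one guest page: resident on the main host or on a sub-host. */
enum page_loc_kind { LOC_LOCAL, LOC_SUBHOST };

struct page_loc {
    enum page_loc_kind kind;
    uint32_t           subhost_id;  /* valid only when kind == LOC_SUBHOST */
};

struct page_location_table {
    size_t           npages;
    struct page_loc *loc;  /* indexed by guest page frame number */
};

/* Create a table; here every page starts out marked as held by sub-host 0,
 * while a real table would be initialized from the one-to-N placement. */
static struct page_location_table *plt_new(size_t npages)
{
    struct page_location_table *t = malloc(sizeof(*t));
    if (!t)
        return NULL;
    t->npages = npages;
    t->loc = calloc(npages, sizeof(*t->loc));
    if (!t->loc) {
        free(t);
        return NULL;
    }
    for (size_t i = 0; i < npages; i++)
        t->loc[i] = (struct page_loc){ .kind = LOC_SUBHOST, .subhost_id = 0 };
    return t;
}

/* Record a page-in (the page is now local to the main host). */
static void plt_paged_in(struct page_location_table *t, size_t pfn)
{
    t->loc[pfn].kind = LOC_LOCAL;
}

/* Record a page-out to the given sub-host. */
static void plt_paged_out(struct page_location_table *t, size_t pfn,
                          uint32_t subhost_id)
{
    t->loc[pfn] = (struct page_loc){ .kind = LOC_SUBHOST,
                                     .subhost_id = subhost_id };
}
```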
Collecting Memory Access Data
S-memV keeps track of memory accesses inside the VM by examining the access bits in the extended page tables (EPT) for the VM. The collected access history is used for split migration, where recently used pages go to the main host, and for remote paging, where the least recently used pages are paged out (see the sketch below).
[Figure: QEMU-KVM collects access bits from the EPT maintained by KVM in the Linux kernel]
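The access-bit walk itself happens inside KVM and is not exposed through a standard user-space API, so the sketch below only illustrates the aging of a per-page access history, assuming a hypothetical helper ept_test_and_clear_accessed() that reports and clears the accessed bit of one guest page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if the EPT entry for guest page `pfn`
 * has its accessed bit set, and clears the bit.  In reality this walk is
 * performed inside KVM; there is no standard user-space interface for it.
 */
bool ept_test_and_clear_accessed(size_t pfn);

/*
 * Scan all guest pages periodically (e.g., once per second) and age a
 * per-page history.  The history shifts right on every scan and gets its
 * top bit set if the page was accessed, so a larger value means more
 * recent use; split migration and page-out decisions can then compare
 * these values.
 */
static void scan_access_bits(uint8_t *history, size_t npages)
{
    for (size_t pfn = 0; pfn < npages; pfn++) {
        history[pfn] >>= 1;
        if (ept_test_and_clear_accessed(pfn))
            history[pfn] |= 0x80;
    }
}
```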
Remote Paging with userfaultfd
QEMU-KVM receives an event when the VM accesses a non-existent page, using userfaultfd introduced in Linux 4.3. It then sends a page-in request to a sub-host, writes the received data to the faulting page, and later sends a page-out request to the sub-host (a sketch of the fault-handling loop follows below).
[Figure: a page fault in the VM is delivered by the Linux kernel to QEMU-KVM on the main host, which exchanges paging requests with the memory server on the sub-host]
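A minimal, self-contained sketch of such a fault-handling loop using the userfaultfd API; fetch_page_from_subhost() is a hypothetical stand-in for the request to the memory server, and error handling and the page-out path are omitted.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <poll.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

#define PAGE_SIZE 4096

/* Hypothetical helper: fetch one page from the sub-host's memory server. */
extern void fetch_page_from_subhost(unsigned long offset, void *buf);

/*
 * Register `region` (length `len`, page-aligned, not yet populated) with
 * userfaultfd and resolve missing-page faults by paging the data in from
 * a sub-host.
 */
static void handle_remote_paging(void *region, size_t len)
{
    int uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);

    struct uffdio_api api = { .api = UFFD_API };
    ioctl(uffd, UFFDIO_API, &api);

    struct uffdio_register reg = {
        .range = { .start = (unsigned long)region, .len = len },
        .mode  = UFFDIO_REGISTER_MODE_MISSING,
    };
    ioctl(uffd, UFFDIO_REGISTER, &reg);

    static char buf[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));

    for (;;) {
        /* Wait for a page fault event from the kernel. */
        struct pollfd pfd = { .fd = uffd, .events = POLLIN };
        poll(&pfd, 1, -1);

        struct uffd_msg msg;
        if (read(uffd, &msg, sizeof(msg)) != sizeof(msg))
            continue;
        if (msg.event != UFFD_EVENT_PAGEFAULT)
            continue;

        /* Page-in: ask the sub-host for the missing page... */
        unsigned long addr =
            msg.arg.pagefault.address & ~((unsigned long)PAGE_SIZE - 1);
        fetch_page_from_subhost(addr - (unsigned long)region, buf);

        /* ...and atomically install it at the faulting address. */
        struct uffdio_copy copy = {
            .dst = addr, .src = (unsigned long)buf,
            .len = PAGE_SIZE, .mode = 0,
        };
        ioctl(uffd, UFFDIO_COPY, &copy);
    }
}
```

One plausible way to implement the page-out direction is to send the evicted page's contents to the memory server and then madvise(MADV_DONTNEED) the page, so that a later access faults again and is delivered through the same userfaultfd.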
Experiments
We examined the performance of split migration. The baseline was VM migration to a destination with sufficient memory, and the comparison was VM migration with virtual memory. We used a VM with 1 vCPU and 2 GB of memory.
[Table: experimental hosts. Source host: Intel Xeon E3-1270v3, 16 GB of memory. Destination / main host: Intel Xeon E3-1270v2, 2 GB or 4 GB of memory (~1 GB used), 600 GB SATA HDD. Sub-host: Intel Xeon E5640. OS: Linux 4.3; virtualization: QEMU-KVM 2.4.1.]
Migration Performance (Idle)
We measured the migration performance for an idle VM. With virtual memory, the migration time was 87% longer and the downtime was 2.9x longer; the degradation was large even though an idle VM causes few memory re-transfers. With split migration, the migration time was 17% longer and the downtime was 0.1 s longer, so the performance degradation was suppressed.
[Graphs: migration time (sec) and downtime (sec) for sufficient memory, split migration, and virtual memory]
Migration Performance (Busy)
We stressed memcached in the VM. With virtual memory, the migration time was 5.4x longer and the downtime was 3.6x longer, and the variance was very large due to paging. With split migration, the migration time was 17% longer and the downtime was 49% shorter; the reason for the shorter downtime is under investigation.
[Graphs: migration time (sec) and downtime (sec) for sufficient memory, split migration, and virtual memory]
Collection of Memory Access Data
We measured the time for collecting access data on the VM's memory. It took more time when more pages were used: 3 ms for 2 GB of memory, an overhead of 0.3% if the data is collected every second. Extrapolating, collecting data for 2 TB of memory would take about 3 s, and probably less, because the EPT shrinks when pages are not accessed.
[Graph: collection time (ms) for the idle and memcached workloads]
VM Performance after Migration
We estimated the performance of a VM with remote paging from [12], using the performance with sufficient memory as the baseline. Quick sort is 1.5-2x slower because its working set is much larger than the local memory. Barnes is almost not degraded because its working set is only slightly larger than the local memory.
[Graph: slowdown of quick sort and Barnes over InfiniBand and Gigabit Ethernet]
Related Work
Post-copy migration [Hines+ VEE'09] is a special case of one-to-N migration but needs two hosts with a large amount of memory. Scatter-Gather migration [Deshpande+ CLOUD'14] is similar to one-to-N migration but finally transfers the whole memory to one host. MemX [Deshpande+ ICPP'10] runs a VM using the memory of multiple hosts but supports only an inflexible form of partial migration.
Conclusion
Split migration divides the memory of a large memory VM and directly migrates the pieces using multiple hosts. Because it is aware of remote paging, it achieves fast VM migration and keeps VM performance after migration. S-memV supports one-to-N migration; its performance was comparable to VM migration with sufficient memory and much better than migration using virtual memory.
Future Work
Integrate several mechanisms into S-memV: collecting memory access data of VMs and remote paging. Evaluate S-memV and show that page-ins/outs are reduced. Support N-to-one and partial migration, which require synchronizing multiple source hosts. Recover from failures during split migration by switching only the failed destination hosts to others.