Managing Disk Usage Enforcement with Logical Volume Manager (LVM)


Explore how to enforce disk usage using the Logical Volume Manager (LVM) at the Execution Point (EP) for better resource management and job isolation. Learn about setting up LVM components, enabling disk enforcement, and configuring disk limits for efficient storage utilization within a Linux environment.

  • Resource Management
  • Disk Enforcement
  • Logical Volume Manager
  • Storage Management
  • Job Isolation


Presentation Transcript


  1. The EP's Disk-iplinary Resource Management: Managing Storage at the Execution Point. By: Cole Bollig, Software Developer for CHTC. Throughput Computing 2025.

  2. Jobs are Guests: Jobs are being permitted to execute on your resources. Don't want jobs to mess with other users' jobs. Don't want jobs to mess up the host machine. (June 4, 2025, Cole Bollig - HTCSS Developer)

  3. Execution Point (EP): CPU and memory are enforced by cgroups. Now we can enforce disk usage! [Diagram: EP resources: CPU, RAM (Memory), SSD (Disk), GPU]

  4. How does the EP enforce disk usage? Using the Logical Volume Manager (LVM). The EP will create a unique ephemeral Logical Volume (LV) for each job the EP runs. Requirements: (1) Linux OS; (2) HTCondor running as root. [Figure: Various elements of the LVM (Wikipedia)]
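The two requirements on this slide can be checked before turning enforcement on. The following is a minimal sketch, not part of HTCondor: the function name `unmet_requirements` is hypothetical, and it inspects the process that runs the script rather than the HTCondor daemons themselves.

```python
from __future__ import annotations

import os
import platform


def unmet_requirements(system: str, euid: int) -> list[str]:
    """Return the slide's requirements that are not satisfied (hypothetical helper)."""
    problems = []
    if system != "Linux":
        problems.append("Linux OS required, found " + system)
    if euid != 0:
        problems.append("must run as root (effective UID 0), found UID %d" % euid)
    return problems


if __name__ == "__main__":
    # os.geteuid() exists only on Unix-like systems; -1 stands in for "unknown".
    euid = os.geteuid() if hasattr(os, "geteuid") else -1
    for problem in unmet_requirements(platform.system(), euid):
        print("unmet:", problem)
```

Passing the system name and UID as arguments keeps the check easy to exercise on a machine that is neither Linux nor running as root.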

  5. How to Enable Disk Enforcement? Add the configuration STARTD_ENFORCE_DISK_LIMITS = True. May also want to set the following configuration options: LVM_BACKING_FILE_SIZE_MB (defaults to 10GB) and LVM_BACKING_FILE. Note: The EP needs a restart to enable or disable LVM enforcement. [Diagram: Volume Group, Physical Volume, Loopback file] See: HTCondor EP LVM Integration Documentation.

  6. Set up LVM for the EP! If specified, the EP will use LVM components it is informed about rather than setting everything up itself. Inform the EP of the Volume Group name to use via LVM_VOLUME_GROUP_NAME. [Diagram: Execution Point (EP), Volume Group, Physical Volume]

  7. Example Provided Setup:
       # pvcreate /dev/ssd1 /dev/ssd2 /dev/ssd3
       # vgcreate condor_vg /dev/ssd1 /dev/ssd2 /dev/ssd3
     Sample configuration:
       STARTD_ENFORCE_DISK_LIMITS = True
       LVM_VOLUME_GROUP_NAME = condor_vg
     [Diagram: Volume Group condor_vg containing Physical Volumes /dev/ssd1, /dev/ssd2, /dev/ssd3]

  8. Benefits of Ephemeral LVs: (1) Improved Job Isolation; (2) Improved Disk Management; (3) Improved Reporting; (4) Improved EP Efficiency; (5) Data Encryption.

  9. Benefit: Improved Job Isolation. Each job runs in its own filesystem (FS). The FS lives in the ephemeral LV created for the job. Make the FS visible only to the user job with LVM_HIDE_MOUNT. Default is AUTO. Note: This option does not play well with VM or Docker Universe jobs! [Diagram: Execution Point (EP) running Miron's Job, Cole's Job, and Todd's Job] See: Configuration Knob - LVM_HIDE_MOUNT.

  10. Benefit: Improved Disk Management. Jobs get what they ask for! A job cannot run away and use more disk than requested. If a job uses all its assigned space, neither the host machine nor other jobs will be affected. [Graphic: per-job disk usage gauges at 97%, 33%, 12%, 78%, 100%]

  11. Benefit: Improved Reporting. HTCondor will monitor the LV usage to report overuse of disk. With a hard cap on disk space comes a greater risk of ENOSPC. The job is put on hold with a nice message rather than appearing to randomly fail or crash. [Meme caption: "If I could have your new disk requests ASAP, that would be great"]
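From the job's side, ENOSPC arrives as an ordinary write error. The sketch below shows how a job written in Python might recognise it and report something more useful than a bare crash. It is illustrative only: `explain_write_failure` and `save` are hypothetical names, and this is not how HTCondor itself detects overuse.

```python
import errno
import os
import tempfile


def explain_write_failure(exc: OSError) -> str:
    """Turn a write error into a message that names the likely cause."""
    if exc.errno == errno.ENOSPC:
        return "out of disk space: consider a larger disk request"
    return "write failed: " + str(exc)


def save(path: str, data: bytes) -> str:
    """Write data to path; return 'ok' or an explanation of the failure."""
    try:
        with open(path, "wb") as handle:
            handle.write(data)
    except OSError as exc:
        return explain_write_failure(exc)
    return "ok"


if __name__ == "__main__":
    # A scratch directory stands in for the job sandbox.
    with tempfile.TemporaryDirectory() as scratch:
        print(save(os.path.join(scratch, "result.bin"), b"x" * 1024))
```

Separating the classification from the write makes the ENOSPC branch testable without actually filling a disk.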

  12. Benefit: Improved EP Efficiency. No more pesky and slow traversal of the job's sandbox! Currently an EP must traverse the entire job sandbox to report the job's disk usage (counting each file manually) and to clean up after the job. With an LV we can do a simple query for disk usage. Once the job is gone, we can simply delete the LV in one step.
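The difference between the two approaches can be sketched in a few lines. This is an analogy under stated assumptions, not HTCondor's implementation: the slides do not say how the EP queries the LV, so `os.statvfs` (Unix only) stands in for "a simple query", and both function names are hypothetical.

```python
import os
import tempfile


def usage_by_traversal(sandbox: str) -> int:
    """Sum file sizes by walking every directory; cost grows with file count."""
    total = 0
    for root, _dirs, files in os.walk(sandbox):
        for name in files:
            total += os.lstat(os.path.join(root, name)).st_size
    return total


def usage_by_query(mount_point: str) -> int:
    """Ask the filesystem for its used bytes with a single statvfs call."""
    stats = os.statvfs(mount_point)
    return (stats.f_blocks - stats.f_bfree) * stats.f_frsize


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as sandbox:
        with open(os.path.join(sandbox, "output.dat"), "wb") as handle:
            handle.write(b"x" * 4096)
        print("traversal:", usage_by_traversal(sandbox))
        # Reports the whole filesystem that holds the directory, not just the directory.
        print("query:", usage_by_query(sandbox))
```

The single query is only meaningful per job when the job has a filesystem to itself, which is what the per-job LV provides; on a shared filesystem the same call reports everyone's usage together.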

  13. Benefit: Data Encryption. The entire LV can be encrypted with cryptsetup. Administrators can enable LV encryption for all jobs via the configuration knob ENCRYPT_EXECUTE_DIRECTORY. Users can request encryption with the submit command encrypt_execute_directory. Encryption used to be provided by eCryptFS before it was deprecated. See: Submit Command - encrypt_execute_directory; Configuration Knob - ENCRYPT_EXECUTE_DIRECTORY.

  14. Thin vs. Thick Provisioning (LVM_USE_THIN_PROVISIONING). Thin Provisioning: disk is provisioned as needed; allows the EP to overprovision the LV; better overuse reporting to the user; requires a backing thinpool LV. Thick Provisioning: disk is provisioned at LV creation time; just requires a Volume Group. [Diagram: Thick LV, Thin LV]
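The arithmetic behind overprovisioning can be made concrete with a simplified model. This sketch ignores LVM metadata overhead and is not HTCondor's admission logic. All names and numbers are hypothetical.

```python
from __future__ import annotations


def thick_fits(volume_group_mb: int, requested_mb: list[int]) -> bool:
    """Thick: each LV reserves its full request when it is created."""
    return sum(requested_mb) <= volume_group_mb


def thin_fits(thinpool_mb: int, written_mb: list[int]) -> bool:
    """Thin: the pool only backs blocks that jobs have actually written."""
    return sum(written_mb) <= thinpool_mb


def overprovision_ratio(thinpool_mb: int, requested_mb: list[int]) -> float:
    """Promised space divided by real space; above 1.0 means overprovisioned."""
    return sum(requested_mb) / thinpool_mb


if __name__ == "__main__":
    capacity_mb = 100_000                  # hypothetical 100 GB of backing storage
    requested = [40_000, 40_000, 40_000]   # three jobs asking for 40 GB each
    written = [10_000, 5_000, 20_000]      # what the jobs have actually written
    print(thick_fits(capacity_mb, requested))           # False: 120 GB > 100 GB
    print(thin_fits(capacity_mb, written))              # True: 35 GB <= 100 GB
    print(overprovision_ratio(capacity_mb, requested))  # 1.2
```

In this model the same three jobs are refused under thick provisioning but fit under thin provisioning, at the cost that the pool can fill up if every job eventually writes its full request.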

  15. Acknowledgement: This work is supported by NSF under Cooperative Agreement OAC-2030508 as part of the PATh Project. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.
