An opportunistic HTCondor pool inside an interactive-friendly Kubernetes cluster

At HTCondor Week 2019 (May 2019), Igor Sfiligoi of UCSD, presenting for the PRP team, described an opportunistic HTCondor pool running inside the interactive-friendly Nautilus Kubernetes cluster: idle, preemptible cycles are offered to Open Science Grid users through an HTCondor pool deployed entirely as containers.

  • HTCondor
  • Kubernetes
  • Cluster
  • Interactive
  • UCSD

Presentation Transcript


  1. An opportunistic HTCondor pool inside an interactive-friendly Kubernetes cluster Presented by Igor Sfiligoi, UCSD, for the PRP team at HTCondor Week 2019 HTCondor Week, May 2019 1

  2. Outline Where do I come from? What did we do? How is it working? Looking ahead HTCondor Week, May 2019 2

  3. The Pacific Research Platform The PRP was originally created as a regional networking project, establishing end-to-end links of 10 to 100 Gbps HTCondor Week, May 2019 3

  4. The Pacific Research Platform The PRP was originally created as a regional networking project, establishing end-to-end links of 10 to 100 Gbps. Expanded nationally since, and beyond, too. [Map: PRPv2 Nautilus and partner FIONA nodes at UWashington, UIC, UIUC, NCAR-WY and Internet2 Chicago/Kansas City/NYC, plus transoceanic nodes in the Netherlands (UvA), Korea (KISTI), U Hawaii, Guam, Singapore and U of Queensland (Australia), linked via CENIC/PW] HTCondor Week, May 2019 4

  5. The Pacific Research Platform Recently the PRP has evolved into a major resource provider, too Because scientists really need more than bandwidth tests They need to share their data at high speed and compute on it, too The PRP now also provides Extensive compute power About 330 GPUs and 3.5k CPU cores A large distributed storage area - About 2 PBytes Select user communities now directly use all the resources the PRP has to offer Still doing all the network R&D in the same setup, too We call it the Nautilus cluster HTCondor Week, May 2019 5

  6. Kubernetes as a resource manager Industry standard Large and active development and support community Container based More freedom for users Flexible scheduling Allows for easy mixing of service and user workloads HTCondor Week, May 2019 6

  7. Designed for interactive use Users expect to get what they need when they need it Makes for very happy users Congestion happens only very rarely And is typically short in duration HTCondor Week, May 2019 7

  8. Opportunistic use Idle compute resources + no congestion = time for opportunistic use HTCondor Week, May 2019 8

  9. Kubernetes priorities Priorities natively supported in Kubernetes Low priority pods only start if no demand from higher priority ones Preemption out of the box Low priority pods killed the moment a high priority pod needs the resources Perfect for opportunistic use Just keep enough low-priority pods in the system https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/ HTCondor Week, May 2019 9
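
As a rough illustration of the mechanism described on this slide (not part of the original presentation), a low-priority class and an opportunistic pod might look like the sketch below; all names, values and images are hypothetical examples, not the PRP's actual configuration.

```yaml
# Hypothetical example of the priority mechanism described above; the class name,
# pod name, and image are placeholders, not the PRP's real configuration.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: opportunistic          # example name for the low-priority class
value: -10                     # below the default of 0, so these pods yield first
globalDefault: false
description: "Opportunistic pods: run only on idle resources, preempted on demand"
---
apiVersion: v1
kind: Pod
metadata:
  name: opportunistic-worker-example
spec:
  priorityClassName: opportunistic     # ties the pod to the low-priority class
  containers:
  - name: worker
    image: example.org/htcondor-startd:latest   # placeholder image
```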

  10. HTCondor as the OSG helper PRP wanted to give opportunistic resources to Open Science Grid (OSG) users Since they can tolerate preemption But OSG does not have native support for Kubernetes Supports only resources provided by batch systems We thus instantiated an HTCondor pool As a fully Kubernetes/Containerized deployment HTCondor Week, May 2019 10

  11. HTCondor in a (set of) container(s) Putting HTCondor in a set of containers is not hard Just create an image with HTCondor binaries in it! Configuration injected through Kubernetes pod config HTCondor deals nicely with ephemeral IPs The Collector must be discoverable Kubernetes service Everything else just works from there Persistency needed for the Schedd(s) And potentially for the Negotiator, if long term accounting desired Everything else can live off ephemeral storage HTCondor Week, May 2019 11
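
The slides do not show the actual deployment files, but a minimal sketch of the two pieces mentioned here, HTCondor configuration injected through Kubernetes objects and a Service that makes the Collector discoverable, could look like this. The namespace, object names, and file name are assumptions; only CONDOR_HOST, USE_SHARED_PORT, and port 9618 are standard HTCondor.

```yaml
# Sketch only: hypothetical namespace, names, and file layout. CONDOR_HOST,
# USE_SHARED_PORT, and port 9618 are standard HTCondor; the rest is illustrative.
apiVersion: v1
kind: ConfigMap
metadata:
  name: condor-config
  namespace: condor
data:
  99-pool.conf: |
    # Minimal HTCondor knobs; a real pool configuration has many more.
    CONDOR_HOST = condor-collector.condor.svc.cluster.local
    USE_SHARED_PORT = True
---
apiVersion: v1
kind: Service
metadata:
  name: condor-collector
  namespace: condor
spec:
  selector:
    app: condor-collector      # must match the collector pod's labels
  ports:
  - port: 9618                 # HTCondor collector / shared-port daemon
    targetPort: 9618
```

Such a ConfigMap would typically be mounted into the pods at /etc/condor/config.d, HTCondor's standard configuration-fragment directory.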

  12. Service vs Opportunistic Collector and Schedd(s) deployed as high priority service pods Should be running at all times Few pods, not high CPU or GPU users, so OK Using Kubernetes Deployment to re-start the pods in case of HW problems and/or maintenance Kubernetes Service used to get a persistent routing IP to the collector pod Startds deployed as low priority pods Pure opportunistic Hundreds of pods in the Kubernetes queue at all times, many in Pending state HTCondor Startd configured to accept jobs as soon as it starts and forever after If pod preempted, HTCondor gets a SIGTERM and has a few seconds to go away HTCondor Week, May 2019 12
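
A hedged sketch of what one opportunistic Startd pod could look like, assuming the priority class and ConfigMap from the earlier sketches; whether the PRP submitted bare pods or used a controller is not stated on the slide, and the image name and grace period are illustrative only.

```yaml
# Hedged sketch of one opportunistic execute pod, mirroring the bullets above:
# low priority, a short grace period so the Startd gets SIGTERM plus a few
# seconds, and the injected configuration. Names and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: condor-startd-example
  namespace: condor
spec:
  priorityClassName: opportunistic     # the example class from the earlier sketch
  terminationGracePeriodSeconds: 10    # "a few seconds to go away" after SIGTERM
  restartPolicy: Never                 # a preempted pod is not restarted in place
  containers:
  - name: startd
    image: example.org/htcondor-startd:latest   # placeholder image
    volumeMounts:
    - name: condor-config
      mountPath: /etc/condor/config.d          # standard HTCondor config directory
  volumes:
  - name: condor-config
    configMap:
      name: condor-config
```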

  13. Then came the users Everything was working nicely, until we let in real users Well, until we had more than a single user OSG users are used to relying on Containers So they can use any weird software they like But the HTCondor Startd is already running inside a container! Cannot launch a user-provided container Not without elevated privileges So I need to provide user-specific execute pods How many of each kind? HTCondor Week, May 2019 13

  14. Then came the users Everything was working nicely, until we let in real users Well, until we had more than a single user OSG users are used to relying on Containers So they can use any weird software they like But the HTCondor Startd is already running inside a container! Cannot launch a user-provided container So I need to provide user-specific execute pods How many of each kind? HTCondor Week, May 2019 14

  15. Dealing with many opportunistic pod types Having idle Startd pods not OK anymore A different kind of pod could use that resource A glidein-like setup would solve that Keeping pods without users not OK anymore They will just terminate without ever running a job Who should regulate the glidein pressure ? How do I manage fair share between different pod types? Kubernetes scheduling is basically just priority-FIFO How am I to know what Container images users want? Ideally, HTCondor should have native Kubernetes support HTCondor Week, May 2019 15

  16. Dealing with many opportunistic pod types Having idle Startd pods not OK anymore A different kind of pod could use that resource A glidein-like setup would solve that Keeping pods without users not OK anymore They will just terminate without ever running a job Who should regulate the glidein pressure ? I know how to implement this. How do I manage fair share between different pod types? Kubernetes scheduling is basically just priority-FIFO How am I to know what Container images users want? Ideally, HTCondor should have native Kubernetes support HTCondor Week, May 2019 16

  17. Dealing with many opportunistic pod types Having idle Startd pods not OK anymore A different kind of pod could use that resource A glidein-like setup would solve that Keeping pods without users not OK anymore They will just terminate without ever running a job Who should regulate the glidein pressure ? I was told this is coming. How do I manage fair share between different pod types? Kubernetes scheduling is basically just priority-FIFO How am I to know what Container images users want? Ideally, HTCondor should have native Kubernetes support HTCondor Week, May 2019 17

  18. Dealing with many opportunistic pod types Having idle Startd pods not OK anymore A different kind of pod could use that resource A glidein-like setup would solve that Keeping pods without users not OK anymore They will just terminate without ever running a job Who should regulate the glidein pressure ? How do I manage fair share between different pod types? In OSG-land, glideinWMS solves this for me. Kubernetes scheduling is basically just priority-FIFO How am I to know what Container images users want? Ideally, HTCondor should have native Kubernetes support HTCondor Week, May 2019 18

  19. Dealing with many opportunistic pod types Having idle Startd pods not OK anymore A different kind of pod could use that resource A glidein-like setup would solve that No concrete plans on how to address these yet. Keeping pods without users not OK anymore They will just terminate without ever running a job Who should regulate the glidein pressure ? How do I manage fair share between different pod types? Kubernetes scheduling is basically just priority-FIFO How am I to know what Container images users want? Ideally, HTCondor should have native Kubernetes support HTCondor Week, May 2019 19

  20. Dealing with many opportunistic pod types For now, I just periodically adjust the balance A completely manual process Currently supporting only a few, well behaved users Maybe not optimal, but good enough But looking forward to a more automated future HTCondor Week, May 2019 20
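
The slides do not show how that balance is expressed, but one plausible sketch is an execute-pod Deployment per user community, with the replica count as the hand-tuned knob; everything below (names, image, numbers) is invented for illustration.

```yaml
# Hypothetical sketch of the manual balancing: one Deployment per user community's
# execute-pod flavor, with "replicas" as the knob adjusted by hand. All names,
# the image, and the number are invented for illustration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: startd-community-a
  namespace: condor
spec:
  replicas: 150                        # raised or lowered manually as demand shifts
  selector:
    matchLabels:
      app: startd-community-a
  template:
    metadata:
      labels:
        app: startd-community-a
    spec:
      priorityClassName: opportunistic
      terminationGracePeriodSeconds: 10
      containers:
      - name: startd
        image: example.org/startd-community-a:latest   # this community's software stack
```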

  21. Are side-containers an option? Ideally, I do want to use user-provided, per-job Containers Running HTCondor and user jobs in separate pods is not an option due to the opportunistic nature But Kubernetes pods are made of several Containers Could I run HTCondor in a dedicated Container, then start the user job in a side-container? [Diagram: a Pod holding an HTCondor container and a user job container] Pretty sure this is currently not supported But, at least in principle, it fits the architecture Would also need HTCondor native support HTCondor Week, May 2019 21
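
Multi-container pods themselves are ordinary Kubernetes; what the slide says is missing is HTCondor support for dispatching the job into the sibling container. A purely conceptual sketch of the pod shape being discussed, with hypothetical names and images:

```yaml
# Purely conceptual: a pod pairing an HTCondor container with a user-provided side
# container that share a scratch volume. The pod shape is ordinary Kubernetes; the
# missing piece, per the slide, is HTCondor support for driving the side container.
apiVersion: v1
kind: Pod
metadata:
  name: execute-with-side-container
spec:
  priorityClassName: opportunistic
  volumes:
  - name: job-scratch
    emptyDir: {}                       # shared execute directory
  containers:
  - name: htcondor                     # runs the Startd and the job wrapper
    image: example.org/htcondor-startd:latest    # placeholder
    volumeMounts:
    - name: job-scratch
      mountPath: /scratch
  - name: user-job                     # the user's software environment
    image: example.org/user-software:latest      # placeholder
    command: ["sleep", "infinity"]     # would somehow need to be driven by HTCondor
    volumeMounts:
    - name: job-scratch
      mountPath: /scratch
```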

  22. It has been pointed out to me that latest CentOS supports unprivileged Singularity Will nested Containers be a reality soon? Have not tried it out Probably I should Cannot currently assume all of my nodes have a recent-enough kernel But we will eventually get there HTCondor Week, May 2019 22

  23. Looking ahead Looking forward to a more automated future Will do what I have to myself Would be happier if I could use off-the-shelf solutions HTCondor Week, May 2019 23

  24. A final picture Opportunistic GPU usage over the past few months HTCondor Week, May 2019 24

  25. Summary We created an opportunistic HTCondor pool in the PRP Kubernetes cluster OSG users can now use any otherwise-unused cycles The lack of nested containerization forces us to have multiple execute pod types Some micromanagement currently needed, hoping for more automation in the future HTCondor Week, May 2019 25

  26. Acknowledgements This work was partially funded by US National Science Foundation (NSF) awards CNS-1456638, CNS-1730158, ACI-1540112, ACI-1541349, OAC-1826967, OAC-1450871, OAC-1659169 and OAC-1841530. HTCondor Week, May 2019 26
