Challenges in Harnessing Widely-Distributed Resources: Galaxy, Condor, and Solutions


This presentation covers the challenges of harnessing widely distributed resources with Galaxy and Condor. The main problems are access to data when there is no shared file system and applications that are probably not installed on remote machines. Two solutions are presented: custom wrappers, and Parrot, a tool developed at UW-Madison and Notre Dame that transparently intercepts disk I/O.

  • Distributed resources
  • Galaxy
  • Condor
  • Solutions
  • Computational sciences

Uploaded on Mar 01, 2025



Presentation Transcript


  1. Galaxy and Condor: Challenges with harnessing widely-distributed resources
     Condor Project, Computer Sciences Department, University of Wisconsin-Madison

  2. Condor
       • Batch scheduler
       • Similar to PBS, SGE
       • Manages a cluster of machines that can run Galaxy tasks
       • Well-suited to widely-distributed systems and sharing resources between groups
     www.cs.wisc.edu/Condor

  3. James Thomson Lab
       • Stem cell research
       • http://discovery.wisc.edu/home/morgridge/research/regenerative-biology/
       • Use Galaxy
       • Condor cluster, 72 cpus
       • We wrote Galaxy module to run tasks using Condor

  4. Additional Machines
       • Dozen Condor clusters at UW: 17,000 cpus
       • Open Science Grid: collaboration of 100 academic institutions, 80,000 cpus
       • Amazon EC2 and similar: how much do you want to spend?

  5. Problem
       • Access to data
           • No shared file system
           • Condor can transfer files
           • Full list of files not easily available
           • Task arguments and input need rewriting
       • Applications
           • Probably not installed

  6. First Solution
       • Write custom wrappers
       • Time-consuming
       • Only suitable for most-used tools
       • Not easily re-usable by other Galaxy users

  7. New Solution
       • Parrot
           • Developed at UW-Madison and Notre Dame
           • http://nd.edu/~ccl/software/parrot/
       • Transparently intercept all disk I/O
       • Perform I/O on Galaxy machine
       • Use http cache for common input files
       • Reduce I/O for sparse file access
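
Slide 5 notes that Condor can transfer files when there is no shared file system. As a rough illustration, the sketch below builds the text of a Condor submit description with file transfer switched on. The function name, its parameters, and the file names in the usage note are hypothetical; the submit commands themselves (`should_transfer_files`, `when_to_transfer_output`, `transfer_input_files`) are standard Condor ones. Argument quoting is ignored here, so this is a starting point rather than what the presenters' Galaxy module does.

```python
def build_submit_description(executable, arguments, input_files, output_file="job.out"):
    """Return submit-description text for a job that ships its inputs with it.

    Hypothetical helper: assumes simple arguments with no spaces or quotes.
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"arguments = {' '.join(arguments)}",
        # No shared file system is assumed, so Condor copies the files itself.
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
    ]
    if input_files:
        # The caller must already know every file the task reads.
        lines.append(f"transfer_input_files = {', '.join(input_files)}")
    lines.append(f"output = {output_file}")
    lines.append("queue")
    return "\n".join(lines) + "\n"
```

Calling `build_submit_description("tool.sh", ["-i", "reads.fq"], ["/data/reads.fq"])` yields text that could be passed to `condor_submit`. The `input_files` parameter is where the slide's difficulty shows up: the full list of files has to come from somewhere.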
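
The custom wrappers of slide 6 are not shown in the slides. One plausible core step, sketched here under that caveat, is rewriting a task's arguments: any argument that names an existing file is added to the transfer list and replaced by its bare file name, since Condor places transferred input files in the job's scratch directory. The function name is hypothetical.

```python
import os

def rewrite_arguments(args):
    """Rewrite file arguments for a job that runs without a shared file system.

    Returns (new_args, files_to_transfer). Only files named directly on the
    command line are detected; files the tool opens implicitly are missed.
    """
    new_args, to_transfer = [], []
    for arg in args:
        if os.path.isfile(arg):
            # Ship the file with the job and refer to it by its bare name.
            to_transfer.append(os.path.abspath(arg))
            new_args.append(os.path.basename(arg))
        else:
            new_args.append(arg)
    return new_args, to_transfer
```

The limitation in the docstring is the reason such wrappers end up tool-specific: index files, reference data, and anything else a tool reads without being told on the command line must be listed by hand, which fits the slide's points that wrappers are time-consuming and not easily re-usable.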
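
For slide 7, Parrot is started by prefixing a command with `parrot_run`, and it exposes HTTP resources under paths of the form `/http/<host>/<path>`. The sketch below only builds such a command line for the read-over-HTTP case. The helper names and the host name in the usage note are hypothetical, and the slides do not say how Parrot was actually wired into Galaxy.

```python
from urllib.parse import urlparse

def parrot_path(url):
    """Map an http:// URL to the path Parrot exposes it under."""
    parts = urlparse(url)
    if parts.scheme != "http":
        raise ValueError("this sketch only handles http:// URLs")
    return f"/http/{parts.netloc}{parts.path}"

def parrot_command(tool_argv):
    """Prefix a tool's argument vector with parrot_run (not executed here)."""
    return ["parrot_run"] + list(tool_argv)
```

For example, `parrot_command(["cat", parrot_path("http://galaxy.example.org/data/input.fa")])` returns `["parrot_run", "cat", "/http/galaxy.example.org/data/input.fa"]`. Because the tool opens what looks like an ordinary path, neither its arguments nor its file list need per-tool rewriting, and only the parts of a file that are actually read have to cross the network.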
