Challenges in Harnessing Widely-Distributed Resources: Galaxy, Condor, and Solutions
This presentation covers the challenges of running Galaxy tasks on widely-distributed resources with Condor: access to data when there is no shared file system, and applications that are probably not installed on remote machines. It presents two solutions: custom wrappers, and Parrot, a tool developed at UW-Madison and Notre Dame that transparently intercepts disk I/O.
Presentation Transcript
Galaxy and Condor: Challenges with harnessing widely-distributed resources
Condor Project
Computer Sciences Department
University of Wisconsin-Madison
Condor
- Batch scheduler
  - Similar to PBS, SGE
- Manages a cluster of machines that can run Galaxy tasks
- Well-suited to widely-distributed systems and sharing resources between groups
www.cs.wisc.edu/Condor
James Thomson Lab
- Stem cell research
  - http://discovery.wisc.edu/home/morgridge/research/regenerative-biology/
- Use Galaxy
- Condor cluster
  - 72 cpus
- We wrote Galaxy module to run tasks using Condor
Additional Machines
- Dozen Condor clusters at UW
  - 17,000 cpus
- Open Science Grid
  - Collaboration of 100 academic institutions
  - 80,000 cpus
- Amazon EC2 and similar
  - How much do you want to spend?
Problem
- Access to data
  - No shared file system
  - Condor can transfer files
    - Full list of files not easily available
    - Task arguments and input need rewriting
- Applications
  - Probably not installed
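The data-access problem above can be made concrete with a minimal sketch in Python. It assumes the full list of input files is already known, which the slide notes is the hard part. It builds a Condor submit description that asks Condor to transfer those files, and it rewrites any argument naming an input file to the bare file name the job will see in its scratch directory. The function name build_submit_description and the example paths are hypothetical; this is an illustration, not the Galaxy module described in the talk.

```python
import os


def build_submit_description(executable, args, input_files, stdout_file):
    """Return Condor submit-description text for a single task.

    Hypothetical helper. Arguments that name an input file are rewritten
    to bare file names, because Condor places transferred files in the
    job's scratch directory on the execute machine.
    """
    inputs = set(input_files)
    rewritten = [os.path.basename(a) if a in inputs else a for a in args]
    lines = [
        "universe = vanilla",
        "executable = " + executable,
        # Simplification: arguments containing spaces would need quoting.
        "arguments = " + " ".join(rewritten),
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "transfer_input_files = " + ", ".join(input_files),
        "output = " + stdout_file,
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_submit_description(
        "/usr/bin/sort",
        ["-n", "/galaxy/data/in.txt"],
        ["/galaxy/data/in.txt"],
        "sort.out",
    ))
```

Every tool takes its inputs in a different form, so the caller has to know which arguments are files. That per-tool knowledge is what makes this approach hard to generalize.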
First Solution
- Write custom wrappers
  - Time-consuming
  - Only suitable for most-used tools
  - Not easily re-usable by other Galaxy users
New Solution
- Parrot
  - Developed at UW-Madison and Notre Dame
  - http://nd.edu/~ccl/software/parrot/
- Transparently intercept all disk I/O
  - Perform I/O on Galaxy machine
  - Use http cache for common input files
  - Reduce I/O for sparse file access
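As a rough sketch of how a task might be launched under Parrot, the Python function below prefixes an unmodified tool command with parrot_run and redirects one data directory to the Galaxy machine. Several things here are assumptions rather than details from the talk: that the Galaxy machine's files are exported by a Chirp server (the slides do not say which protocol was used), that the installed parrot_run accepts a --mount option of the form local=remote (this should be checked against the Parrot manual), and the host name and paths, which are made up. The function name parrot_argv is hypothetical.

```python
def parrot_argv(tool_argv, galaxy_host, data_dir):
    """Prefix a tool command with parrot_run so that file access under
    data_dir is redirected to galaxy_host.

    Hypothetical helper. The tool's own arguments are left untouched,
    which is the point of intercepting I/O instead of rewriting paths.
    """
    # Assumption: a Chirp server on galaxy_host exports data_dir, so the
    # remote copy is reachable under Parrot's /chirp/<host>/<path> namespace.
    remote = "/chirp/" + galaxy_host + data_dir
    # Assumption: parrot_run supports --mount <local>=<remote>.
    return ["parrot_run", "--mount", data_dir + "=" + remote] + list(tool_argv)


if __name__ == "__main__":
    argv = parrot_argv(
        ["sort", "-n", "/galaxy/data/in.txt"],
        "galaxy.example.org",
        "/galaxy/data",
    )
    print(" ".join(argv))
```

Unlike the wrapper approach, nothing in this sketch needs to know which arguments are files, so the same launcher could in principle serve any tool.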