
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
"Exploring the execution of complex scientific applications on hybrid cloud infrastructures through research challenges and the Atmosphere Framework, focusing on workflows, provisioning, and optimization."
Presentation Transcript
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Maciej Malawski (1,2), Piotr Nowakowski (1), Tomasz Gubała (1), Marek Kasztelnik (1), Marian Bubak (1,2), Rafael Ferreira da Silva (3), Ewa Deelman (3), Jarek Nabrzyski (4)
NSFCloud Workshop on Experimental Support for Cloud Computing, December 11-12, 2014, Arlington, VA
AGH University of Science and Technology:
(1) ACC Cyfronet AGH, ul. Nawojki 11, 30-950 Kraków, Poland
(2) Department of Computer Science, al. Mickiewicza 30, 30-095 Kraków, Poland
(3) University of Southern California, Information Sciences Institute, Marina Del Rey, CA, USA
(4) Center for Research Computing, University of Notre Dame, IN, USA

Research Challenges
- Execution of complex scientific applications on clouds: workflows and their ensembles
  - Pegasus Workflow Management System (OCI SI2-SSI #1148515)
  - HyperFlow Workflow Engine
- Platform for deployment and sharing of scientific applications on hybrid clouds
  - Atmosphere Framework
- Algorithms for scheduling, provisioning and cost optimization
  - Dynamic and Static Algorithms
  - Mathematical Programming
  - Cloud Workflow Simulator

Research: The Atmosphere Framework
Hybrid cloud as a means of provisioning computing power for virtual experiments
- Cloud Management Portlets (on the GUI host, which provisions end-user features and access options): provide GUI elements which enable service developers and end users to interact with the Atmosphere platform and create/deploy services on the available cloud resources
- Atmosphere Core Services Host
  - Secure RESTful API (Cloud Facade)
  - Atmosphere Core
    - Authentication and authorization logic
    - Communication with underlying computational clouds
    - Launching and monitoring service instances
    - Creating new service templates
    - Billing and accounting
    - Logging and administrative services
- Atmosphere Registry (AIR): user accounts, available cloud sites, services and templates
- Cloud sites
  - OpenStack cloud site at ACC CYFRONET AGH: head node, worker nodes, image store; 96 CPU cores, 184 GB RAM, 4 TB storage, private IP space
  - VPH-Share cloud site at UNIVIE: head node, worker nodes with a large resource pool ("fat nodes"), image store; 128 CPU cores, 256 GB RAM, 4 TB storage, private IP space
  - Amazon Elastic Compute Cloud (EC2), European availability zone: API host, worker nodes, image store; massive (functionally limitless) hardware resource pool, public IP space

Research: Simulation and Scheduling of Large-Scale Scientific Workflows on IaaS Clouds
- Large-scale scientific workflows from Pegasus WMS
  - Workflows of 100,000 tasks
- Workflow ensembles
  - Schedule as many workflows as possible within a budget and deadline
- Uses a Cloud Workflow Simulator
[Figure: schedule diagram with axes "VM" and "Time"]
Reference: M. Malawski, G. Juve, E. Deelman, J. Nabrzyski: Cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds. SC 2012: 22

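The ensemble goal on this slide (admit as many workflows as possible within a budget and a deadline) can be illustrated with a deliberately simplified sketch. This is not the provisioning algorithm from the SC 2012 paper. It assumes that each workflow comes with a priority, an estimated cost and an estimated critical-path length, and all names (`Workflow`, `admit`) and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    priority: int           # lower value = more important (assumed convention)
    est_cost: float         # estimated cost in dollars (hypothetical input)
    critical_path_h: float  # estimated minimum makespan in hours


def admit(workflows, budget, deadline_h):
    """Greedily admit workflows in priority order.

    A workflow is admitted if its critical path fits within the deadline
    and its estimated cost fits within the remaining budget.
    """
    admitted = []
    remaining = budget
    for wf in sorted(workflows, key=lambda w: w.priority):
        if wf.critical_path_h <= deadline_h and wf.est_cost <= remaining:
            admitted.append(wf.name)
            remaining -= wf.est_cost
    return admitted, remaining


ensemble = [
    Workflow("a", 1, 40, 3),
    Workflow("b", 2, 70, 2),
    Workflow("c", 3, 50, 5),
    Workflow("d", 4, 20, 1),
]
print(admit(ensemble, budget=100, deadline_h=4))  # (['a', 'd'], 60)
```

Workflow "b" is skipped because it no longer fits the remaining budget, and "c" because its critical path exceeds the deadline. A real provisioner would also have to decide how many VMs to run and when.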
Research: Cost Optimization of Applications on Clouds
- Infrastructure model
  - Multiple compute and storage clouds
  - Heterogeneous instance types
- Application model
  - Bag of tasks
  - Multi-level workflows
- Modeling with AMPL and CMPL
  - Modeling Language for Mathematical Programming
- Cost optimization
  - Under deadline constraints
  - Mixed integer programming
  - Bonmin, Cplex solvers
[Figure: infrastructure model. An application reads input from and writes output to storage (Amazon S3, Rackspace Cloud Files) and runs on compute resources: a private cloud, Rackspace instances (rs.1gb, rs.2gb, rs.4gb, rs.16gb) and Amazon instances (t1.micro, m1.small, m1.large, m2.xlarge)]
[Figure: multi-level workflow with layers 1 to 5 and tasks A to F, with task run times between 0.3 h and 6 h]
[Figure: Cost ($, 0 to 3000) versus time limit (hours, 0 to 100) for 20000 tasks, 512 MiB input and 512 MiB output, task execution time 0.1 h on a 1 CCU machine. Series: optimal, multiple providers, Amazon and private instances, Rackspace and private instances, Rackspace instances]
References:
M. Malawski, K. Figiela, J. Nabrzyski, Cost minimization for computational applications on hybrid cloud infrastructures, Future Generation Computer Systems, 29(7), 2013, pp. 1786-1794, http://dx.doi.org/10.1016/j.future.2013.01.004
M. Malawski, K. Figiela, M. Bubak, E. Deelman, J. Nabrzyski, Cost Optimization of Execution of Multi-level Deadline-Constrained Scientific Workflows on Clouds. PPAM, 2013, pp. 251-260, http://dx.doi.org/10.1007/978-3-642-55224-3_24

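To make the idea of deadline-constrained cost minimization for a bag of tasks concrete, here is a toy sketch. The slide names mixed integer programming with AMPL/CMPL and the Bonmin and Cplex solvers; the sketch below swaps that for plain exhaustive search over instance counts. It also ignores storage and data-transfer costs and assumes every instance is billed for the whole deadline. The instance names, prices and the function `cheapest_plan` are hypothetical.

```python
from itertools import product


def cheapest_plan(instance_types, n_tasks, task_minutes_at_1ccu, deadline_h,
                  max_per_type=20):
    """Exhaustively search instance counts for the cheapest feasible plan.

    instance_types maps a name to (price in cents per hour, speed in CCU).
    Returns (total cost in cents, {name: count}) or None if nothing fits.
    """
    names = list(instance_types)
    best = None
    for counts in product(range(max_per_type + 1), repeat=len(names)):
        capacity = 0
        cost = 0
        for name, n in zip(names, counts):
            price, ccu = instance_types[name]
            # Tasks one VM of this type can finish before the deadline.
            tasks_per_vm = deadline_h * 60 * ccu // task_minutes_at_1ccu
            capacity += n * tasks_per_vm
            # Simplification: each VM is billed for the full deadline.
            cost += n * price * deadline_h
        if capacity >= n_tasks and (best is None or cost < best[0]):
            best = (cost, dict(zip(names, counts)))
    return best


# Hypothetical instance types: (cents per hour, CCU).
types = {"small": (10, 1), "large": (35, 4)}
print(cheapest_plan(types, n_tasks=200, task_minutes_at_1ccu=30, deadline_h=10))
# (900, {'small': 2, 'large': 2})
```

Exhaustive search only works for a handful of instance types and small counts, which is why a solver-based formulation is the practical choice at the scale shown on the slide.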
Research: Cloud Performance Evaluation
- Performance of VM deployment times
- Virtualization overhead
- Evaluation of open source cloud stacks
  - Eucalyptus, OpenNebula, OpenStack
- Survey of European public cloud providers
- Performance evaluation of top cloud providers
  - EC2, RackSpace, SoftLayer
  - A grant from Amazon has been obtained

Table: survey of IaaS providers (1 = criterion met, 0 = not met)
Criteria and weights:
  HB = Per-hour instance billing (5)
  VM = VM image import/export (3)
  JC = jClouds API support (20)
  BL = BLOB storage support (10)
  DB = Relational DB support (2)
  EZ = EEA zoning (20)
  AP = API access (5)
  PP = Published price (5)

 #  Provider              HB  VM  JC  BL  DB  EZ  AP  PP  Score
 1  Amazon AWS            1   0   1   1   1   1   1   1   27
 2  Rackspace             1   0   1   1   1   1   1   1   27
 3  SoftLayer             1   0   1   1   0   1   1   1   25
 4  CloudSigma            1   1   1   0   0   1   1   1   18
 5  ElasticHosts          1   1   1   0   0   1   1   1   18
 6  Serverlove            1   1   1   0   0   1   1   1   18
 7  GoGrid                1   0   1   0   0   1   1   1   15
 8  Terremark ecloud      1   1   1   0   0   1   1   0   13
 9  RimuHosting           0   0   1   0   1   1   1   1   12
10  Stratogen             0   1   1   0   0   1   1   0   8
11  Bluelock              0   0   1   0   0   1   1   0   5
12  Fujitsu GCP           0   0   1   0   0   1   1   0   5
13  BitRefinery           0   0   0   0   1   0   0   1   0
14  BrightBox             1   1   0   0   0   1   1   1   0
15  BT Global Services    0   1   0   0   0   1   1   0   0
16  Carpathia Hosting     0   1   0   0   0   1   0   0   0
17  City Cloud            1   0   0   0   0   1   1   1   0
18  Claris Networks       1   0   0   0   0   0   0   0   0
19  Codero                1   0   0   0   0   0   1   1   0
20  CSC                   0   1   0   0   0   1   0   0   0
21  Datapipe              1   0   0   0   0   1   1   0   0
22  e24cloud              1   0   0   0   0   1   0   1   0
23  eApps                 0   0   0   0   0   0   0   1   0
24  FlexiScale            1   1   0   0   0   1   1   1   0
25  Google GCE            1   0   0   1   1   1   1   1   0
26  Green House Data      0   1   0   0   0   0   1   0   0
27  Hosting.com           0   1   0   0   1   0   0   1   0
28  HP Cloud              1   1   1   1   1   0   1   1   0
29  IBM SmartCloud        1   0   0   1   1   0   1   1   0
30  IIJ GIO               0   0   0   0   0   0   0   0   0
31  iland cloud           1   1   0   0   0   1   0   1   0
32  Internap              1   0   0   1   0   0   1   1   0
33  Joyent                1   0   0   0   0   0   1   1   0
34  LunaCloud             1   0   0   1   0   1   1   1   0
35  Oktawave              1   0   0   1   1   1   1   1   0
36  Openhosting.co.uk     0   0   0   0   0   1   0   1   0
37  Openhosting.com       1   1   1   0   0   0   1   1   0
38  OpSource              1   1   0   1   0   1   1   1   0
39  ProfitBricks          1   0   0   0   0   1   1   1   0
40  Qube                  0   0   0   0   0   1   0   1   0
41  ReliaCloud            0   0   0   0   0   0   0   0   0
42  SavvisDirect          1   0   0   1   0   0   0   1   0
43  SkaliCloud            1   1   1   0   0   0   1   1   0
44  Teklinks              0   0   0   0   0   0   0   0   0
45  Terremark vcloud      1   1   1   0   0   0   1   1   0
46  Tier 3                0   0   0   0   0   0   1   0   0
47  Umbee                 1   1   0   0   0   1   1   1   0
48  VPS.net               0   0   0   0   0   1   1   1   0
49  Windows Azure         1   0   0   1   1   1   1   1   0

Reference: M. Bubak, M. Kasztelnik, M. Malawski, J. Meizner, P. Nowakowski, S. Varma, Evaluation of Cloud Providers for VPH Applications, poster at CCGrid2013, Delft, the Netherlands, pp. 13-16, 2013

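The slide does not say how the scores in the provider table are derived from the weights. The numbers are consistent with one reading: a provider lacking jClouds API support or EEA zoning scores 0, and otherwise the score is the sum of the remaining weights for the criteria it meets. The Python sketch below encodes that inferred rule; the rule itself and the names `MANDATORY` and `provider_score` are assumptions, while the weights and the three sample rows are taken from the table.

```python
# Criterion weights as listed on the slide.
WEIGHTS = {
    "hourly_billing": 5,
    "vm_image_import_export": 3,
    "jclouds_api": 20,
    "blob_storage": 10,
    "relational_db": 2,
    "eea_zoning": 20,
    "api_access": 5,
    "published_price": 5,
}

# ASSUMPTION (inferred from the scores, not stated in the source):
# these two criteria act as mandatory gates, not as additive weights.
MANDATORY = ("jclouds_api", "eea_zoning")


def provider_score(features):
    """Return the score for a dict mapping criterion name to 0 or 1."""
    if not all(features.get(name, 0) for name in MANDATORY):
        return 0
    return sum(
        weight
        for name, weight in WEIGHTS.items()
        if name not in MANDATORY and features.get(name, 0)
    )


# Rows copied from the table on the slide.
amazon = dict(hourly_billing=1, vm_image_import_export=0, jclouds_api=1,
              blob_storage=1, relational_db=1, eea_zoning=1,
              api_access=1, published_price=1)
cloudsigma = dict(hourly_billing=1, vm_image_import_export=1, jclouds_api=1,
                  blob_storage=0, relational_db=0, eea_zoning=1,
                  api_access=1, published_price=1)
hp_cloud = dict(hourly_billing=1, vm_image_import_export=1, jclouds_api=1,
                blob_storage=1, relational_db=1, eea_zoning=0,
                api_access=1, published_price=1)

print(provider_score(amazon), provider_score(cloudsigma), provider_score(hp_cloud))
# 27 18 0
```

Under this reading, HP Cloud meets almost every criterion but scores 0 because it lacks EEA zoning.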
Experiment: Evaluation of autoscaling techniques for Atmosphere cloud platform
- Challenges
  - Requires repeated tests under varying workloads
  - Experiments in an isolated environment
- Goals
  - Perform autoscaling based on:
    - Complex event processing
    - Time series database
  - Build an isolated environment on NSFCloud

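The slide does not describe the autoscaling rules themselves. As a far simpler stand-in for event-processing or time-series based policies, the sketch below applies a threshold rule to a sliding window of load samples. The class name `WindowAutoscaler`, the thresholds and the window size are all hypothetical and are not part of Atmosphere.

```python
from collections import deque


class WindowAutoscaler:
    """Threshold rule over a sliding window of load samples (illustrative only)."""

    def __init__(self, window=3, scale_up_at=0.8, scale_down_at=0.3,
                 min_vms=1, max_vms=10):
        self.samples = deque(maxlen=window)
        self.scale_up_at = scale_up_at
        self.scale_down_at = scale_down_at
        self.min_vms = min_vms
        self.max_vms = max_vms
        self.vms = min_vms

    def observe(self, load):
        """Record one load sample in [0, 1] and return the desired VM count."""
        self.samples.append(load)
        if len(self.samples) < self.samples.maxlen:
            return self.vms  # not enough history yet
        average = sum(self.samples) / len(self.samples)
        if average > self.scale_up_at and self.vms < self.max_vms:
            self.vms += 1
            self.samples.clear()  # start a fresh window after scaling
        elif average < self.scale_down_at and self.vms > self.min_vms:
            self.vms -= 1
            self.samples.clear()
        return self.vms


scaler = WindowAutoscaler()
history = [scaler.observe(load) for load in (0.9, 0.9, 0.9, 0.1, 0.1, 0.1)]
print(history)  # [1, 1, 2, 2, 2, 1]
```

Clearing the window after each scaling action is one simple way to avoid reacting twice to the same burst. Evaluating such choices under repeatable workloads is the kind of test the slide says needs an isolated environment.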
Experiment: Scalability of Scientific Workflows in HyperFlow Model
- Challenges
  - Data transfer and data locality issues
  - Calibrating the performance models of applications
- Goals
  - Execute large-scale deployments on multi-site NSFCloud facilities
  - Assess the impact of network latency and bandwidth limitations
[Figure: HyperFlow running on the PaaSage platform. Labels: PaaSage platform (Upperware, Executionware), PaaSage application, workflow CAMEL generator, CAMEL model, metrics, deploy & scale infrastructure, cloud, monitoring, VMs, workflow generator, workflow graph, HyperFlow engine, workflow scheduler, ready tasks, scheduled tasks, job queue, RabbitMQ, executors, results, Redis]

Experiment: Influence of Variability of Clouds on the Quality of Algorithms
- Challenges
  - Static scheduling methods assume that estimates of task runtimes are available
  - Runtime variations and other uncertainties influence the actual execution
- Goals
  - A large-scale experimental testbed will allow investigating the influence of these uncertainties
  - Development of new models to mitigate the negative effects of uncertainties
[Figure: Makespan / Deadline (0.0 to 2.0) for the DPDS, WADPDS and SPSS algorithms at runtime estimate errors of 0%, 1%, 2%, 5%, 10%, 20% and 50%]

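The effect described on this slide, a static plan built from runtime estimates drifting once actual runtimes differ, can be sketched with a small simulation. This is not DPDS, WADPDS or SPSS. It assumes a fixed task-to-VM assignment and a uniform multiplicative error model, and the function names are hypothetical.

```python
import random


def makespan(assignment):
    """assignment: one list of task runtimes (hours) per VM, run sequentially."""
    return max(sum(tasks) for tasks in assignment)


def perturb(assignment, error, rng):
    """Scale every runtime by a factor drawn uniformly from [1 - error, 1 + error]."""
    return [[t * rng.uniform(1 - error, 1 + error) for t in tasks]
            for tasks in assignment]


def deadline_ratio(assignment, deadline_h, error, seed=0):
    """Makespan / deadline after perturbing the estimated runtimes."""
    rng = random.Random(seed)
    return makespan(perturb(assignment, error, rng)) / deadline_h


# A static plan that exactly meets a 3-hour deadline when estimates are perfect.
plan = [[2.0, 1.0], [1.5, 1.0]]
print(deadline_ratio(plan, deadline_h=3.0, error=0.0))  # 1.0
for err in (0.01, 0.05, 0.2, 0.5):
    print(err, round(deadline_ratio(plan, deadline_h=3.0, error=err), 3))
```

A ratio above 1.0 means the perturbed execution misses the deadline even though the static plan met it on paper, which is the outcome the Makespan / Deadline plot on the slide measures.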
Experiment: Interoperation of Cloud Testbed of PL-Grid Infrastructure with NSFCloud
- PL-Grid
  - One of the largest national grid infrastructures in Europe (2500+ users, 500+ teams)
  - Cloud testbed based on OpenNebula and OpenStack
- Goals
  - Possibility to run transatlantic and global-scale experiments
  - Evaluation of the impact of wide-area and high-latency networks

Thank you.
- DICE Team at AGH: http://dice.cyfronet.pl
- Center for Research Computing at Notre Dame: https://crc.nd.edu
- Pegasus Team at USC: http://pegasus.isi.edu