
Distributed Computing Group Meeting Insights
"Explore the status of Storm and Lustre with multi-VO support discussed in the Distributed Computing Group meeting on Oct 23, 2014. Discover test configurations, performance data, and ongoing challenges for the Storm and Lustre setup. Dive into the ILC-DIRAC study for user interface Python code examples and job repository management details." (285 characters)
Presentation Transcript
Status of StoRM+Lustre and Multi-VO Support
YAN Tian
Distributed Computing Group Meeting, Oct. 23, 2014
StoRM + Lustre: Test Bed
SE server configuration (this test machine was originally prepared for a dCache+Lustre frontend, and thus has good network performance):
- Model: Dell PowerEdge R620
- CPU: Xeon E5-2609 v2 @ 2.50 GHz, 8 cores
- Memory: 64 GB
- HDD: SCSI, 300 GB
- Network 1: eth0, 1 Gbps
- Network 2: eth4, 10 Gbps
A symbolic link points to the Lustre directory, so users can access files in that directory through the StoRM WebDAV portal.
StoRM + Lustre Test 1: Single-Thread Download
- Test time: Oct 15, 17:50--18:40, while Lustre was not busy (load 7%, out 80 MB/s)
- 20 files of size 1 GB
- Average download speed: 10.6 MB/s over eth0 (1 Gbps)
- Load on the SE: 0.8~1.1, with 11~13% I/O wait (wa)
- For comparison, when Lustre is busy: out 500~1400 MB/s
StoRM + Lustre Test 2: Multi-Thread/Multi-Process Download
- The multi-thread download tool mytget cannot start multi-thread mode for Lustre
- Multi-process wget download does not improve things much: 22~33 MB/s (screenshots: 4 processes, 8 processes; a sketch of such a test follows below)
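For illustration, a minimal sketch of a multi-process download measurement; the WebDAV URL, file names, and file size are placeholders, not taken from the slides:

    import subprocess, time

    BASE_URL = "https://storm-test.example.cn/webdav/lustre"  # placeholder endpoint
    FILES = ["test%02d.dat" % i for i in range(8)]            # placeholder file list

    start = time.time()
    # One wget per process, all started in parallel
    procs = [subprocess.Popen(["wget", "-q", BASE_URL + "/" + f]) for f in FILES]
    for p in procs:
        p.wait()
    elapsed = time.time() - start
    # Assumes 1 GB (1024 MB) per file, as in Test 1
    print("aggregate throughput: %.1f MB/s" % (len(FILES) * 1024.0 / elapsed))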
StoRM + Lustre Test 3: Symbolic Link Problem
Modifying namespace.xml is being tried as a fix.
StoRM + Lustre Test: To Do
- Solve the symbolic link problem
- Dataset transfer test between IHEPD-USER
- Open ports 50000:55000
- Dataset transfer test between WHU/USTC-USER
ILC-DIRAC Study: User Interface
Python code, which can be executed directly. A job script example:

    from DIRAC.Core.Base import Script
    Script.parseCommandLine()

    from ILCDIRAC.Interfaces.API.DiracILC import DiracILC
    dirac = DiracILC(True, "my_job_repository.rep")  # the dirac instance is the job receiver

    from ILCDIRAC.Interfaces.API.NewInterface.UserJob import UserJob
    job = UserJob()
    job.setName("MyJobName")
    job.setJobGroup("Agroup")
    job.setCPUTime(86400)

    from ILCDIRAC.Interfaces.API.NewInterface.Applications import Mokka, Marlin

    # Define and set parameters for each application
    mo = Mokka()
    mo.setLogFile("sim-job.log")
    mo.setInputFile("init.macro")
    mo.setOutputFile("E250-CDR_wo_Pnnh.eL.eR.001.slcio")
    mo.setNumberOfEvents(1000)
    job.append(mo)

    mar = Marlin()
    mar.setParameters("value")  # slide shorthand for the app-specific setters
    mar.getInputFromApp(mo)     # chain Marlin's input to Mokka's output
    job.append(mar)             # applications stack

    job.submit(dirac)
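To run such a script, one would typically initialize a grid proxy first; a minimal sketch, assuming the standard dirac-proxy-init command (the group name and script file name are assumptions):

    $ dirac-proxy-init -g ilc_user
    $ python myjobscript.py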
ILC-DIRAC Study: Job Repository
The repository contains all necessary information about submitted jobs.
For job monitoring:
$ dirac-repo-monitor repo.cfg
For retrieving all the output sandboxes and output data:
$ dirac-repo-retrieve-jobs-output repo.cfg
The repository is a functionality provided by DIRAC; the scripts call three methods of the Dirac API, as sketched below.
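A minimal sketch of driving the repository from the API directly, assuming the three standard Dirac API repository methods (monitorRepository, retrieveRepositorySandboxes, retrieveRepositoryData) are what the scripts above wrap:

    from DIRAC.Core.Base import Script
    Script.parseCommandLine()
    from ILCDIRAC.Interfaces.API.DiracILC import DiracILC

    dirac = DiracILC(True, "repo.cfg")         # attach the existing job repository
    dirac.monitorRepository(printOutput=True)  # print the status of every job in the repo
    dirac.retrieveRepositorySandboxes()        # download all output sandboxes
    dirac.retrieveRepositoryData()             # download all output data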
ILC-DIRAC Study: Applications
Many applications:
- Generation: Whizard, Pythia, StdHepCut
- Simulation: Mokka, SLIC
- Reconstruction: Marlin, LCSIM, SLICPandora
- Analysis: Marlin, ROOT, Druid, etc.
A command for users to query available applications and their versions:
$ dirac-ilc-show-software
Applications are all defined in the modules:
- ILCDIRAC.Interfaces.API.NewInterface.Application (base class)
- ILCDIRAC.Interfaces.API.NewInterface.Applications
In a job script:

    from ILCDIRAC.Interfaces.API.NewInterface.Applications import Mokka
    mo = Mokka()
    mo.setParameters1(value1)  # slide shorthand for the app-specific setters
    mo.setParameters2(value2)
    job.append(mo)

A GenericApplication is provided for executables outside ILCSoft, e.g.:

    ga = GenericApplication()
    ga.setScript("boss.exe")
    ga.setArguments("jobOptions.txt")

(a fuller sketch follows below)
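For context, a minimal sketch of a complete job built around GenericApplication; the repository name, job name, and sandbox contents are placeholders, not from the slides:

    from DIRAC.Core.Base import Script
    Script.parseCommandLine()
    from ILCDIRAC.Interfaces.API.DiracILC import DiracILC
    from ILCDIRAC.Interfaces.API.NewInterface.UserJob import UserJob
    from ILCDIRAC.Interfaces.API.NewInterface.Applications import GenericApplication

    dirac = DiracILC(True, "my_repo.rep")
    job = UserJob()
    job.setName("boss-job")
    job.setInputSandbox(["boss.exe", "jobOptions.txt"])  # ship executable and options with the job

    ga = GenericApplication()
    ga.setScript("boss.exe")           # executable to run on the worker node
    ga.setArguments("jobOptions.txt")  # its command-line arguments
    job.append(ga)

    job.submit(dirac)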
ILC-DIRAC Study: User Input Data
For ILC analysis jobs, users often need their own library files (*.so).
ILC solution: upload them to an SE, download them to the WN:
$ tar czf lib.tar.gz lib/
$ dirac-dms-add-files /ilc/user/i/initial/some/path/lib.tar.gz lib.tar.gz CERN-SRM
$ dirac-dms-remove-files /ilc/user/i/initial/some/path/lib.tar.gz
In the job script:
job.setInputSandbox("LFN:/ilc/user/i/initial/some/path/lib.tar.gz")
ILC also allows users to browse the file catalog:
$ dirac-dms-filecatalog
ILC-DIRAC Study: Class Inheritance
DIRAC classes and their ILC-DIRAC counterparts:
- Dirac → DiracILC
- Job → UserJob, ProductionJob (Splitter?)
- Application (base class) → Applications
- ModuleBase → MokkaAnalysis, MarlinAnalysis, PythiaAnalysis, etc.
ILC-DIRAC: Module Example (MokkaAnalysis)
Called by the Job Agent:

    from ILCDIRAC.Workflow.Modules import MokkaAnalysis
    ma = MokkaAnalysis()
    ma.execute()

In this module:
1. retrieve job parameters
2. write a shell script that:
   a) sets the environment
   b) runs the application
   c) returns the status code
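For illustration only, a toy sketch of those steps in plain Python; this is not the actual ILCDIRAC module code, and the parameter names, environment script, and Mokka command line are placeholders:

    import os, stat, subprocess

    def execute(job_parameters):
        # 1. retrieve job parameters (names here are placeholders)
        steering = job_parameters.get("SteeringFile", "init.macro")

        # 2. write a shell script that sets the environment and runs the application
        script = "run_mokka.sh"
        with open(script, "w") as f:
            f.write("#!/bin/bash\n")
            f.write("source ./mokka_env.sh\n")  # a) set the environment (placeholder)
            f.write("Mokka %s\n" % steering)    # b) run the application
            f.write("exit $?\n")                # c) return the status code
        os.chmod(script, os.stat(script).st_mode | stat.S_IEXEC)

        # 3. execute the script and propagate the application's exit status
        return subprocess.call(["./" + script])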