
IRIS Data Movement and Storage Developments by Alastair Dewhurst
IRIS has invested in various storage endpoints, including Grid storage at multiple sites, a file system at Cambridge, and S3 and tape storage at RAL. Storage usage remains below capacity despite numerous requests, largely because of the steep learning curve of GridPP storage. The talk discusses why a globally accessible file system is not the answer, and the challenges faced by research collaborations that rely on multiple heterogeneous resources. IRIS has invested in digital assets such as Rucio, IAM, DIRAC, DynaFed, and FTS to improve Grid services and make data management easier. Rucio, the distributed data management system used by ATLAS and CMS, automates data handling and now supports multi-VO instances. Data movement and orchestration also require effort, with tools like FTS facilitating rapid transfers between sites.
Presentation Transcript
1. Data Movement and Storage Developments
Alastair Dewhurst
2. Introduction
IRIS has invested in various storage endpoints:
- Grid Storage (multiple sites)
- File System (Cambridge)
- S3 Storage at RAL
- Tape Storage at RAL
Unlike CPU usage, storage usage is very much below capacity, even though we have many requests. Why is this?
- GridPP storage works but has a very steep learning curve.
- Data is persistent, so evolution is slower. Sites don't want to lose it.
- Experiments need to put in effort to migrate to the next thing.
[Slide graphic: logos of ATLAS, LHCb, CMS, ALICE, DUNE, S3 Gen]
Alastair Dewhurst, 17th November 2020
3. What do we need?
Why can't we just have a globally accessible file system?
- The semantics of a POSIX file system are not compatible with highly scalable systems.
- It's expensive, and you don't really need it.
In a research collaboration there is the added complication of needing to rely on multiple heterogeneous resources.
At small scale, file systems are fine:
- Few users: manual creation of accounts.
- Metadata: encoded in the directory structure or on a local wiki page.
- Throughput: single-machine bottlenecks are acceptable.
- Data resilience: duplicate it.
The Grid evolved for a reason.
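To make the "metadata encoded in the directory structure" point concrete, here is a minimal sketch of how a small collaboration might recover metadata from file paths alone. The layout (`/data/<experiment>/<campaign>/<run>/<file>`) is purely illustrative, not a real IRIS convention; the point is that this approach stops working once you have many users, many sites, and metadata that does not fit in a path.

```python
from pathlib import PurePosixPath

def metadata_from_path(path: str) -> dict:
    """Recover metadata that small collaborations often encode
    directly in the directory layout, e.g.
    /data/<experiment>/<campaign>/<run>/<filename>.
    (Hypothetical layout, for illustration only.)"""
    parts = PurePosixPath(path).parts
    # parts[0] is '/', parts[1] is the top-level 'data' area
    if len(parts) < 6 or parts[1] != "data":
        raise ValueError(f"unexpected layout: {path}")
    return {
        "experiment": parts[2],
        "campaign": parts[3],
        "run": parts[4],
        "filename": parts[5],
    }

meta = metadata_from_path("/data/dune/2020B/run00042/hits.h5")
print(meta["experiment"], meta["run"])  # dune run00042
```

This is exactly the kind of implicit convention that a proper data management system replaces with an explicit, queryable catalogue.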
4. IRIS Data Digital Assets
IRIS has invested in several storage digital assets:
- Improving Grid services.
- Integrating modern features.
- Making things easier.
There are lots of related developments that depend on each other. Delivery times are long, but we are starting to see results, although I wouldn't call the system alive yet.
Assets: Rucio, IAM, DIRAC, DynaFed, FTS.
5. Storage Services
[Slide diagram: Rucio, FTS and DynaFed layered over the storage protocols (SRM / XrootD, XrootD, WebDAV, S3) and the tape back ends (Castor, CTA)]
6. Rucio
Rucio is a distributed data management system:
- Used by ATLAS and CMS as well as others, so it has long-term support.
- Policy driven: you say what you want, and Rucio figures out how.
At its core is a database that lists all your data, as well as any metadata you want to record. It also contains information on how to use the various storage endpoints. Many smart daemons automate the work.
RAL finished development of a multi-VO Rucio instance in July. A test instance has been running since August, supporting SKA and dteam. The aim is to enter production in the first half of 2021. There is significant funding from other sources (EGI, SWIFT-HEP) to continue development of this.
Alastair Dewhurst, 17th November 2020
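The "policy driven" idea above can be sketched in a few lines: you declare *what* you want (N replicas on endpoints matching some attributes) and an engine decides *where* they go. This is a toy model of Rucio's replication rules and RSE expressions, not the real Rucio API; the endpoint names and attributes are invented for illustration.

```python
# Toy catalogue of storage endpoints with attribute tags, loosely
# modelled on Rucio's RSE-expression idea. Names are illustrative,
# not real IRIS endpoints.
RSES = {
    "RAL-ECHO":   {"type": "DISK", "site": "RAL"},
    "RAL-TAPE":   {"type": "TAPE", "site": "RAL"},
    "CAM-DISK":   {"type": "DISK", "site": "CAM"},
    "LANCS-DISK": {"type": "DISK", "site": "LANCS"},
}

def apply_rule(dataset, copies, expression):
    """Declarative replication rule: request `copies` replicas of
    `dataset` on endpoints whose attributes match `expression`,
    and let the engine choose the concrete endpoints."""
    candidates = [name for name, attrs in RSES.items()
                  if all(attrs.get(k) == v for k, v in expression.items())]
    if len(candidates) < copies:
        raise RuntimeError("rule cannot be satisfied")
    # A real system would weigh free space, locality, ongoing
    # transfers, etc.; here we just take the first N alphabetically.
    return {dataset: sorted(candidates)[:copies]}

print(apply_rule("ska:obs-001", copies=2, expression={"type": "DISK"}))
# → {'ska:obs-001': ['CAM-DISK', 'LANCS-DISK']}
```

In real Rucio, daemons then continuously reconcile the catalogue against these rules, triggering transfers and deletions until the declared state is met.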
7. Data Movement
Moving lots of data rapidly between sites requires effort:
- We want to avoid proxying data through intermediate machines.
- Authorisation between sites is non-trivial.
FTS is designed to orchestrate transfers between many sites, and is utilised by DIRAC and Rucio. Many tools are available to upload to S3. PaNOSC is using FTS.
Alastair Dewhurst, 17th November 2020
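The point about avoiding proxying is that an FTS job only *describes* a transfer; the data then flows directly between the source and destination storage endpoints (third-party copy), never through the submitting host. A minimal sketch of building such a job description follows; the field names are written from memory of the FTS3 REST submission format and should be checked against the FTS3 documentation, and the URLs are placeholders.

```python
import json

def build_fts_job(src, dst, checksum=None):
    """Build a JSON body describing a third-party-copy job in the
    style of the FTS3 REST API. Only a description is produced here;
    submission and the transfer itself are out of scope.
    (Field names assumed from the FTS3 submission format.)"""
    transfer = {"sources": [src], "destinations": [dst]}
    if checksum:
        transfer["checksum"] = checksum
    job = {
        "files": [transfer],
        "params": {"overwrite": True, "retry": 3},
    }
    return json.dumps(job, indent=2)

body = build_fts_job(
    "root://source.example.org/data/file1",      # placeholder source
    "https://dest.example.org/dune/file1",       # placeholder destination
    checksum="adler32:12345678",
)
print(body)
```

Because the job is just data, an orchestrator such as Rucio or DIRAC can generate thousands of these and let FTS schedule, retry, and checksum-verify them across sites.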
8. DynaFed Development
Active DynaFed digital asset work started in September 2020:
- IAM integration
- Rucio integration
DynaFed provides browser access to S3 storage. The IAM integration is working in test: https://dynafed-test.stfc.ac.uk/gridpp
Email feedback to: Sam.glendenning@stfc.ac.uk
Alastair Dewhurst, 17th November 2020
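DynaFed's core trick is that the federator holds no data: it answers "where can I read this path?" by checking its back-end endpoints and redirecting the client to one that has the file. A toy sketch of that redirect-style federation, with invented endpoint names and an in-memory stand-in for the real endpoint probing:

```python
# Invented endpoints and file lists standing in for real S3 buckets;
# a real federator would probe the endpoints rather than hold a map.
ENDPOINTS = {
    "https://s3-a.example.org": {"/gridpp/readme.txt", "/gridpp/data.bin"},
    "https://s3-b.example.org": {"/gridpp/data.bin"},
}

def locate(path):
    """Return a redirect URL for the first endpoint holding `path`.
    In a real federator this would become an HTTP 302 Location
    header, so the data flows endpoint-to-client, never through
    the federator itself."""
    for base, files in sorted(ENDPOINTS.items()):
        if path in files:
            return base + path
    raise FileNotFoundError(path)

print(locate("/gridpp/data.bin"))
# → https://s3-a.example.org/gridpp/data.bin
```

This is why DynaFed pairs naturally with S3 and with IAM: the federator only needs to authorise the request and mint a short-lived redirect, while the storage endpoints serve the bytes.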
9. Global File Systems
Multi-Site CephFS [1]:
- Duplicates data across sites.
- Not obvious how you scale to many sites.
CVMFS:
- Single upload point; everywhere else is read-only.
- Scales to tens of TB of data.
[1] https://indico.ph.qmul.ac.uk/indico/getFile.py/access?contribId=2&sessionId=2&resId=0&materialId=slides&confId=571
Alastair Dewhurst, 17th November 2020
10. Summary
- There does not yet exist a straightforward solution for storage; some VOs are doing it themselves.
- There are ongoing projects to make the Grid easier to use.
- There is lots of interest in S3.
Alastair Dewhurst, 17th November 2020