Design Your E-Infrastructure for Plant Phenotyping Data Management

Explore the design, implementation plan, and user insights for an e-infrastructure supporting plant biologists in managing and analyzing phenotyping data across European facilities. Learn about user characteristics, value delivery, system usage, and development timeline.

  • e-infrastructure
  • plant phenotyping
  • data management
  • user insights
  • development timeline

Presentation Transcript


  1. Design your e-infrastructure! (https://indico.egi.eu/indico/event/4434/) Use case: EMPHASIS. Break-out group coordinator: Baptiste Grenier. Amsterdam, 9 May 2019.

  2. Group members: Mark, Vincent, Diego, Baptiste

  3. First break-out: Background and users

  4. Who will be the user? Can the users be characterised? How many are they?
     • Plant biologists, ecophysiology
     • European plant phenotyping network: 22 partners, 31 facilities
     • Potentially 1000 to 2000 users, 10 to 100 per site
     • 3 early adopters: INRA, Jülich, Wageningen
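As a rough cross-check of the figures on this slide, multiplying the 31 facilities by the 10 to 100 users per site gives a range that can be compared with the 1000 to 2000 estimate. The sketch below is only illustrative arithmetic; the names are hypothetical and it assumes every facility falls within the same per-site range.

```python
# Figures taken from the slide; names are illustrative only.
FACILITIES = 31
USERS_PER_SITE = (10, 100)      # low and high users per facility
SLIDE_ESTIMATE = (1000, 2000)   # "potentially 1000 to 2000 users"


def total_user_range(facilities, per_site):
    """Return (low, high) total users if every site is within per_site."""
    low, high = per_site
    return facilities * low, facilities * high


def brackets(outer, inner):
    """True if the outer range fully contains the inner range."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


low, high = total_user_range(FACILITIES, USERS_PER_SITE)
print(f"Per-site figures imply {low} to {high} users in total")
print("Slide estimate falls inside that range:",
      brackets((low, high), SLIDE_ESTIMATE))
```

The per-site figures imply a wider range than the headline estimate, which is consistent with the estimate sitting somewhere between the two extremes.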

  5. What value will the envisaged system deliver for them (the whole setup)? What exactly will the system deliver to them?
     • An integrated and federated data management and processing solution for plant phenotyping
     • One deployment per facility
     • Integrated (in the long term): users will be able to access data from other sites within the next few years, but at first only locally
     • Independent datasets per facility
     • Common subset/data type that can be extended by sites

  6. How should they use the system?
     • Federated AAI: authentication, access policies / authorization
     • User selects data or downloads it using an API; in the long term, a Science Gateway/portal integrating data and tools
     • User runs analysis on a computing infrastructure: Galaxy, Jupyter (able to retrieve data directly), or an arbitrary computing facility
     • User management is done through a web/unified interface

  7. What's the timeline for development, testing and large-scale operation? (Consecutive releases can/should be considered.)
     • Piloting activities with 3 early adopters, starting from now
     • To validate the solution to be implemented by 2021

  8. Second break-out: Design and implementation plan

  9. What should the first version include? The most basic product prototype imaginable that already brings value to the users (the so-called Minimum Viable Product, MVP).
     • Importing data: from acquisition to storage, and from storage to computing
     • Managing rights
     • Access to a computing platform
     • A basic Science Gateway integrating (federated?) access, (discovery of data?) and running Galaxy-based workflows
     • Integrating existing tools as Galaxy/Jupyter workflows

  10. Which components/services already exist in this architecture?
     • Low-level web services for sending, sharing and accessing the data
     • Local management of users and groups / access
     • Some local and limited computing resources
     • Basic web interface: local user and group management, download of small quantities of data
     • Local analysis tools deployed on computing resources: Python for image analysis, R for numeric analysis (more on the scientist's machine)

  11. Which components/services are under development (and by whom)?
     • Connection between iRODS and PHIS: software development group in charge of the IS
     • Porting analysis tools to Galaxy/Jupyter: computing scientists
     • Existing web interface complemented with a complete Science Gateway

  12. Which components/services should still be brought into the system? Which EGI (or other) partner can do it?
     • Cloud services (cloud/container): to deploy the IS, to do analysis, to host metadata (PID, ) in MongoDB, RDF
     • Storage services (integrating with computing):
        • To store raw data (images) (B2SAFE, DataHub)
        • To store processed data (B2SAFE, DataHub, Data Transfer)
        • Archival of data (B2SAFE)
        • Distributed/federated file system (B2SAFE, DataHub)
     • AAI (Check-in, B2ACCESS)
     • PID management (B2HANDLE)
     • Science Gateway (Notebooks, AoD)
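The need-to-service pairing on this slide can be written down as a small lookup table, which makes it easy to see which candidate service would cover the most needs. This is only a sketch: the dictionary keys are paraphrased labels for the slide's bullet points, and the helper name `needs_by_service` is hypothetical.

```python
from collections import defaultdict

# Need -> candidate services, transcribed from the slide.
# Keys are paraphrased labels, not official service names.
CANDIDATE_SERVICES = {
    "store raw data (images)": ["B2SAFE", "DataHub"],
    "store processed data": ["B2SAFE", "DataHub", "Data Transfer"],
    "archive data": ["B2SAFE"],
    "distributed/federated file system": ["B2SAFE", "DataHub"],
    "AAI": ["Check-in", "B2ACCESS"],
    "PID management": ["B2HANDLE"],
    "Science Gateway": ["Notebooks", "AoD"],
}


def needs_by_service(candidates):
    """Invert the mapping: service -> list of needs it is a candidate for."""
    inverted = defaultdict(list)
    for need, services in candidates.items():
        for service in services:
            inverted[service].append(need)
    return dict(inverted)


# Print services ordered by how many needs they could cover.
coverage = needs_by_service(CANDIDATE_SERVICES)
for service, needs in sorted(coverage.items(), key=lambda kv: -len(kv[1])):
    print(f"{service}: {len(needs)} need(s): {', '.join(needs)}")
```

Inverting the table this way shows at a glance that the storage needs overlap heavily on the same candidates, which may help when scoping the storage-related pilots.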

  13. Are there gaps in the service catalogues that should be filled to implement the use case? Which service provider could fill the gap?
     • Science Gateway built on the existing web services

  14. Next steps
     1. Agreeing on who to involve and on regular meetings: all, mid-May
     2. Designing (what, how, who, how much) pilot projects: mid-June
        • Computing-related pilots (EGI, EMPHASIS)
        • Storage-related pilots (EGI, EUDAT, EMPHASIS)
        • Fully integrated pilot (EGI, EUDAT, EMPHASIS)
        • Report to technical support activities in EOSC-hub (EGI, EUDAT)
        • Apply to Early Adopter programme https://www.eosc-hub.eu/news/eosc-hub-launches-early-adopter-programme
     3. Conducting pilots (EGI, EUDAT, EMPHASIS): by September?
     4. Evaluation / validation: by October?
     5. Agreements, to production: later
