JMD Requirements and Considerations in Building a Reliable System
This presentation covers the requirements and considerations for building a robust Java/JSOC Mirroring Daemon (JMD), with a focus on making it standalone and resilient. It discusses the architecture, the software used, peer and authoritative nodes, Slony replication, the state machine, and the handling of various system states and errors.
Presentation Transcript
JMD (Java/JSOC? Mirroring Daemon)
Authors: Igor Suárez-Solá, Alisdair Davey, Joe Hourclé, VSO Team
- JMD requirements
- JMD structure
- Current issues
- Wish list
- Alternatives to the JMD
- Conclusion
Considerations in building the JMD
- Make it as standalone as possible.
- Use DRMS/SUMS interfaces but not DRMS system resources, e.g. we don't want a badly behaving JMD to affect DRMS operations.
- Make it as resilient as possible.
- Should be retriable
- Should be self-contained
Software used
- Jetty webserver 6.1
- Derby embedded db 10.5.3
- Java 1.6
- Development with Eclipse IDE
- Wiki page: http://vso.tuc.noao.edu/VSO/w
[Architecture diagram: your node, a peer node, and the authoritative node, connected by Slony replication, jsoc_fetch queries, and data scp-ing.]
Slony replication
Triggers on the replicated series tables write sunum, recnum, and series_name to the sunum_queue table. The JMD reads the sunums to process from the queue table and deletes them as soon as they are written to the embedded db.
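The hand-off from the queue table to the embedded db can be sketched as follows. This is a minimal in-memory sketch, not the JMD's actual code: `QueueRow`, `QueueDrainer`, and the two collections standing in for the sunum_queue table and the embedded database are all hypothetical names. The only point it illustrates is the ordering: a row leaves the queue only after it has been written locally.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical row shape: the three columns the triggers write to sunum_queue.
final class QueueRow {
    final long sunum;
    final long recnum;
    final String seriesName;

    QueueRow(long sunum, long recnum, String seriesName) {
        this.sunum = sunum;
        this.recnum = recnum;
        this.seriesName = seriesName;
    }
}

// Hypothetical drainer: in-memory collections stand in for the replicated
// sunum_queue table and for the embedded database.
final class QueueDrainer {
    private final List<QueueRow> sunumQueue; // stand-in for sunum_queue
    private final Map<Long, QueueRow> embeddedDb = new LinkedHashMap<Long, QueueRow>();

    QueueDrainer(List<QueueRow> sunumQueue) {
        this.sunumQueue = sunumQueue;
    }

    // Copies up to maxRows rows into the local store, removing each row from
    // the queue only after it has been written locally.
    int drain(int maxRows) {
        int moved = 0;
        Iterator<QueueRow> it = sunumQueue.iterator();
        while (it.hasNext() && moved < maxRows) {
            QueueRow row = it.next();
            embeddedDb.put(row.sunum, row); // write to the embedded store first
            it.remove();                    // then delete from the queue
            moved++;
        }
        return moved;
    }

    // Number of distinct sunums held in the local store.
    int localCount() {
        return embeddedDb.size();
    }
}
```

Against real tables the same ordering would normally be wrapped in a transaction, so that a crash between the write and the delete leaves the row in the queue to be picked up again.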
[Flowchart: processing of a new SU. Decision nodes: SU local? SU in peer nodes? SU in auth nodes? Is JMD Tier2? Actions: allocate SU in SUMS, scp SU from peer node, scp SU from AUTH node, SUMPUT, retry after N minutes. Outcomes: DONE, FAIL, PENDING, INVALID, NO SU.]
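One possible reading of the flowchart's decisions is sketched below. The branch order is an assumption pieced together from the flowchart labels and the state definitions, and `NextStep` and `SuRouter` are hypothetical names. The FAIL path (retry after N minutes) and the SUMS allocation step are not modelled.

```java
// Hypothetical outcome of the routing decision for one new storage unit (SU).
enum NextStep { DONE, FETCH_FROM_PEER, FETCH_FROM_AUTH, PENDING, INVALID }

final class SuRouter {
    // One reading of the flowchart; the order of the checks is an assumption.
    static NextStep decide(boolean local, boolean inPeerNodes,
                           boolean inAuthNodes, boolean jmdIsTier2) {
        if (local) {
            return NextStep.DONE;            // SU is already in the local SUMS
        }
        if (inPeerNodes) {
            return NextStep.FETCH_FROM_PEER; // scp SU from a peer node
        }
        if (!inAuthNodes) {
            return NextStep.INVALID;         // not found in the authoritative node
        }
        if (jmdIsTier2) {
            return NextStep.PENDING;         // wait for it to be mirrored upstream
        }
        return NextStep.FETCH_FROM_AUTH;     // scp SU from the authoritative node
    }
}
```

Keeping the decision as a pure function of four booleans makes each branch easy to check without touching SUMS or the network.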
State machine
[State diagram with the states NQUE, REDY, BQUED, QUED, DONE, FAIL, PNDG, and INVD.]
- NQUE (Non-Queued): SUs from the sunum_queue table.
- REDY (Ready): Peer nodes already queried.
- BQUED (Being Queued): SUs are queued into the thread manager priority queue.
- QUED (Queued): SU being handled by a thread.
- DONE (Final state): SU in SUMS.
- FAIL (Retriable state): Some error encountered.
- PNDG (Pending): Tier2 nodes. Waiting for SUs to be mirrored upstream.
- INVD (Invalid): SU not found in the Authoritative node.
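The state names above map naturally onto a Java enum. The sketch below is an illustration rather than the JMD's own type: `SuState` and its methods are hypothetical names. It encodes only what the definitions state, namely that DONE is final and FAIL is retriable. Whether INVD is final is left open, since the slides raise that as an unresolved question, and no transitions are encoded because the diagram's arrows are not recoverable from the transcript.

```java
// Hypothetical enum for the eight JMD states listed in the slides.
enum SuState {
    NQUE("Non-Queued: SU read from the sunum_queue table"),
    REDY("Ready: peer nodes already queried"),
    BQUED("Being Queued: SU queued into the thread manager priority queue"),
    QUED("Queued: SU being handled by a thread"),
    DONE("Final state: SU in SUMS"),
    FAIL("Retriable state: some error encountered"),
    PNDG("Pending: Tier2 node waiting for the SU to be mirrored upstream"),
    INVD("Invalid: SU not found in the authoritative node");

    private final String description;

    SuState(String description) {
        this.description = description;
    }

    String description() {
        return description;
    }

    // Only DONE is described as final; INVD is deliberately not treated as one.
    boolean isFinal() {
        return this == DONE;
    }

    // Only FAIL is described as retriable.
    boolean isRetriable() {
        return this == FAIL;
    }
}
```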
Issues I know of
- Is Invalid a final state?
- Request a new sunum
- It would be nice to distinguish between a non-existent sunum and a sunum that has timed out.
- JMD stops writing to log file.
- Jetty embedded database issues: size
- Sometimes network connection drops temporarily (?)
- Memory leak in sums_svc. JMD driven?
TODO/Wish list
- Integrate JMD development in NetDRMS.
- Add reprocessing interface request to Authoritative Node.
- Send jsoc_fetch requests according to series, e.g. hmi to MPS and AIA to SAO.
- Add a pool of threads just for user requests.
- Make the table sunum_queue a parameter of JMD.cfg.
- Implement a more BitTorrent-like architecture in the JMD.
- Group SUs so scp download is more efficient.
- GUI front end to the JMD?
- Use HTTP or FTP to allow pulling from NASA.
- Show coverage of SUs (Elie Soubrié)
- Other requests?
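The wish-list item about sending jsoc_fetch requests according to series could be approached with a simple prefix lookup. The sketch below is one possible shape, not an existing JMD feature: `SeriesRouter`, its methods, the `"hmi."` and `"aia."` prefixes, the example series names, and the default site label are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical router: picks the site to query from the series name prefix.
final class SeriesRouter {
    private final Map<String, String> siteByPrefix = new LinkedHashMap<String, String>();
    private final String defaultSite;

    SeriesRouter(String defaultSite) {
        this.defaultSite = defaultSite;
    }

    // Registers a prefix (stored in lower case) and returns this for chaining.
    SeriesRouter route(String prefix, String site) {
        siteByPrefix.put(prefix.toLowerCase(Locale.ROOT), site);
        return this;
    }

    // Returns the site of the first registered prefix that matches,
    // or the default site when none does.
    String siteFor(String seriesName) {
        String lower = seriesName.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> entry : siteByPrefix.entrySet()) {
            if (lower.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return defaultSite;
    }
}
```

A router built with `new SeriesRouter("AUTH").route("hmi.", "MPS").route("aia.", "SAO")` would send a hypothetical series named `hmi.example_series` to MPS and anything unmatched to the default.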
Some stats
- 90K to 200K files daily per site
- 400GB to 2TB daily per site
- NSO mirroring stats since June 17th: 8 million SUs, 65TB
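To put these volumes in network terms, the arithmetic below converts a daily volume into a sustained transfer rate and a total into a mean size per SU. It assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes) and a transfer spread evenly over 24 hours, neither of which the slides state. `MirrorStats` is a hypothetical helper name.

```java
// Hypothetical helper for back-of-the-envelope mirroring arithmetic.
final class MirrorStats {
    static final double SECONDS_PER_DAY = 86400.0;

    // Sustained rate in Mbit/s needed to move bytesPerDay evenly over one day.
    static double sustainedMbitPerSec(double bytesPerDay) {
        return bytesPerDay * 8.0 / SECONDS_PER_DAY / 1e6;
    }

    // Mean size in MB (decimal) of one item, given a total volume and a count.
    static double meanMegabytes(double totalBytes, double count) {
        return totalBytes / count / 1e6;
    }
}
```

Under those assumptions, 400GB per day corresponds to roughly 37 Mbit/s sustained and 2TB per day to roughly 185 Mbit/s, while 65TB over 8 million SUs averages about 8.1 MB per SU.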
Do you need an alternative?
- Buying 3rd party software
- GDMP (Grid Data Mirroring Package)
- Other packages?
- Rewrite?