Application of Container Technology in LHD Analysis System
This study presents the implementation of Docker, a container technology, in the LHD Analysis System for automatic analysis of physical data. The AutoAna system, execution process, and advantages of parallel processing are discussed with visual illustrations. The use of multiple processes across PCs enhances computational performance.
Presentation Transcript
APPLICATION OF CONTAINER TECHNOLOGY IN LHD ANALYSIS SYSTEM
M. Emoto, K. Ida, S. Imazu, and M. Yoshida
National Institute for Fusion Science, 509-5292 Japan
OUTLINE
1. The AutoAna, an automatic analysis system for physical data
2. Problems in parallel execution
3. Adoption of Docker, a container technology
4. Future work and summary
OVERVIEW OF THE AutoAna
[Figure: AutoAna data flow among DATABASE SERVER, UPDATER, SERVER, and EXECUTERs; arrows labeled Notification of Registration, Calculation Request, Analysis Program Execution, and Registration]
Name     | No.  | Role
------------------------------------------------------------------------------------
UPDATER  | 1    | Receives data notification packets from the Analyzed Server, and requests SERVER to create the physical data that require the registered data as input.
SERVER   | 1    | Maintains the job requests.
EXECUTER | 160~ | Gets a job request from SERVER and executes an analysis program.
[Figure: dependency graph of data A, B, C, D, and E, with a legend distinguishing manually registered data from automatically registered data]
This figure shows how the AutoAna executes the analysis programs. Analyzed data D is calculated from A and B, and E is calculated from C and D. A and C are registered, but the other three data (B, D, and E) are not registered yet.
When B is registered in the Database Server, the server notifies the AutoAna of the data registration, and the AutoAna begins a job to create D.
In the same way, when D is registered in the Database Server, the AutoAna begins to calculate E.
Thus, the AutoAna runs the analysis programs automatically and maintains the dependencies among the data.
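The registration-driven behaviour described on these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DEPENDS` table and the function `jobs_triggered_by` are hypothetical names, and the real AutoAna receives notification packets from the database server rather than function calls.

```python
# Hypothetical dependency table taken from the figure:
# D is calculated from A and B, E is calculated from C and D.
DEPENDS = {"D": ["A", "B"], "E": ["C", "D"]}


def jobs_triggered_by(new_data, registered, depends=DEPENDS):
    """Return the data that can be calculated now that `new_data` is registered."""
    registered = set(registered) | {new_data}
    ready = []
    for target, inputs in depends.items():
        # A job is started only for data that use the new input, are not
        # registered yet, and have every input available.
        if (new_data in inputs and target not in registered
                and all(name in registered for name in inputs)):
            ready.append(target)
    return sorted(ready)


registered = {"A", "C"}                       # state on the first slide
first = jobs_triggered_by("B", registered)    # B is registered
registered.add("B")
second = jobs_triggered_by("D", registered)   # D is registered once its job ends
print(first, second)
```

In this sketch, registering B makes D the only candidate, and registering D in turn makes E the only candidate, which mirrors the sequence shown on the slides.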
EXECUTION OF MULTIPLE PROCESSES BY MULTIPLE PCs
To gain computational performance, multiple EXECUTERs work simultaneously. Currently, over 160 EXECUTERs are running on 11 PCs, and 219 analysis programs written in various computer languages are provided by researchers. Each EXECUTER picks a job request from SERVER in a first-in, first-out manner and runs the analysis program, so every EXECUTER handles the programs equally. With this technique, key data of the experiment become available within the 3-minute cycle of a short-pulse experiment.
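The client-pull scheme can be illustrated with a first-in, first-out queue shared by several workers. The sketch below is an assumption-laden stand-in: it uses threads in one process, whereas the real EXECUTERs are separate processes on several PCs that fetch requests from SERVER over the network, and `run_executers` is a hypothetical name.

```python
import queue
import threading


def run_executers(job_requests, n_executers=4):
    """Run every job once; each EXECUTER thread pulls the oldest pending request."""
    pending = queue.Queue()          # queue.Queue is first-in, first-out
    for job in job_requests:
        pending.put(job)

    finished = []                    # (executer id, job) pairs
    lock = threading.Lock()

    def executer(executer_id):
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return               # no more requests: this EXECUTER stops
            # A real EXECUTER would launch the analysis program here.
            with lock:
                finished.append((executer_id, job))

    threads = [threading.Thread(target=executer, args=(i,))
               for i in range(n_executers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return finished


# Job names and shot numbers are taken from the execution status slide.
jobs = [("vmec", 134137), ("newboz", 135648), ("fit3d_1MW_autoana", 169180)]
done = run_executers(jobs, n_executers=2)
print(sorted(job for _, job in done))
```

Because every worker takes whatever request is oldest, no worker is reserved for a particular program, which matches the equal treatment described on the slide.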
EXECUTION STATUS
No. | Job No. | Diagnostics       | Shot No | Start          | Host
-------------------------------------------------------------------------
  1 | 135431  | vmec              | 134137  | 06/29 16:04:24 | egcalc8
  2 | 135432  | newboz            | 135648  | 06/29 16:04:25 | egcalc8
  3 | 135433  | fit3d_1MW_autoana | 169180  | 06/29 16:04:25 | egcalc20
  4 | 135434  | vmec              | 134138  | 06/29 16:04:26 | egcalc20
  5 | 135435  | newboz            | 135649  | 06/29 16:04:26 | egcalc7
  6 | 135436  | fit3d_1MW_autoana | 169181  | 06/29 16:04:27 | egcalc11
  7 | 135437  | vmec              | 134139  | 06/29 16:04:27 | egcalc11
  8 | 135438  | newboz            | 135650  | 06/29 16:04:28 | egcalc20
  9 | 135439  | fit3d_1MW_autoana | 169182  | 06/29 16:04:28 | egcalc17
 10 | 135440  | vmec              | 134140  | 06/29 16:04:28 | egcalc8
...
157 | 135706  | fit3d_1MW_autoana | 169271  | 06/29 16:06:25 | egcalc18
158 | 135707  | vmec              | 134229  | 06/29 16:06:26 | egcalc8
159 | 135708  | newboz            | 135740  | 06/29 16:06:26 | egcalc18
160 | 135709  | fit3d_1MW_autoana | 169272  | 06/29 16:06:27 | egcalc7
OPERATIONAL PROBLEM IN PARALLEL EXECUTION
The analysis programs are provided by the researchers, who developed them to run exclusively on their own computers for their own use. This causes problems in a parallel execution environment:
1. Multiple programs read and write the same files, which causes conflicts.
2. Temporary files remain after a program finishes and waste disk space.
3. Each program uses a different runtime environment, and it is difficult to maintain these environments for a long time.
EXECUTION IN CONTAINER
By adopting Docker, a container technology, the problems that occur in simultaneous execution can be solved as follows:
1. Temporary files are created inside the container, which avoids conflicts.
2. All files created during execution can be removed at the end of execution.
3. Each container has its own execution environment, and that environment is isolated.
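The first two points can be illustrated without Docker by a plain-Python analogy, in which each job gets a private scratch directory that disappears when the job ends. This is not how a container is implemented; it only shows why a private, disposable file area removes both the conflict and the leftover files. The function `run_job_isolated` and the file name `work.tmp` are hypothetical.

```python
import os
import tempfile


def run_job_isolated(shot_no):
    """Run a toy 'analysis' inside a private scratch directory and clean it up."""
    with tempfile.TemporaryDirectory() as scratch:
        # Every job uses the same file name, but in its own directory,
        # so simultaneous jobs cannot overwrite each other's file.
        work_file = os.path.join(scratch, "work.tmp")
        with open(work_file, "w") as f:
            f.write(str(shot_no))
        with open(work_file) as f:
            result = int(f.read())
    # Leaving the `with` block deletes the directory and everything in it.
    return result, os.path.exists(scratch)


print(run_job_isolated(144103))
```

With Docker, the same effect comes from the container's own file system, and the `--rm` option of `docker run` removes that file system when the container exits.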
CONS AND PROS OF DOCKER CONTAINERIZATION
Pros: Lightweight and fast. Small size. The runtime environment is included in the container.
Cons: Containers share the host OS and are affected by host OS vulnerabilities. Containers work well on Linux. Docker supports Windows and macOS, but uses a virtual machine on them.
[Figure: a hosted virtual machine (AP on OS on VM, on top of virtualization software and the host OS) compared with a container (AP on top of the container engine and the host OS)]
LAYER STRUCTURE
Docker uses a layered file system (overlay FS). Images can share layers to save disk space. tsdnn uses old libraries and cannot run in the current environment.
[Figure: layer stacks of three images, listed here from bottom to top]
Basic analysis programs: CentOS, Library, AP (shell application)
DNN programs using TensorFlow 1.13: CentOS, Library, Python3, TensorFlow 1.13, AP (thomson_fit)
DNN programs using TensorFlow 1.6 (older than 1.13): CentOS, Library, Python3, TensorFlow 1.13, TensorFlow 1.6, AP (tsdnn)
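How much layer sharing saves can be estimated with a small count. The sketch below assumes the three layer stacks as read from the flattened figure (the exact order of the layers is an assumption), counts layers rather than bytes because the slide gives no sizes, and uses the hypothetical names `IMAGES` and `count_layers`.

```python
# Layer stacks from bottom to top, as read from the figure (order assumed).
IMAGES = {
    "basic": ["CentOS", "Library", "AP (shell application)"],
    "thomson_fit": ["CentOS", "Library", "Python3", "TensorFlow 1.13",
                    "AP (thomson_fit)"],
    "tsdnn": ["CentOS", "Library", "Python3", "TensorFlow 1.13",
              "TensorFlow 1.6", "AP (tsdnn)"],
}


def count_layers(images):
    """Return (layers without sharing, layers actually stored with sharing)."""
    total = sum(len(stack) for stack in images.values())
    # A layer is reusable only when everything beneath it is identical,
    # so each layer is identified by the whole chain up to and including it.
    stored = {tuple(stack[:i + 1])
              for stack in images.values()
              for i in range(len(stack))}
    return total, len(stored)


print(count_layers(IMAGES))
```

Under these assumptions the three images contain 14 layers in total, but only 8 distinct layers need to be stored.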
EXECUTION OF DOCKER IMAGE MANUALLY
A Docker application is portable, so analysis programs can be run without installing software or libraries. For example, a user can just type docker run CONTAINER-NAME SHOT# to register analyzed data from their own PC:
% docker pull eghome.lhd.nifs.ac.jp:5000/tsdnn
% docker run eghome.lhd.nifs.ac.jp:5000/tsdnn 144103
Docker applications can run on Windows and macOS as well as Linux, which makes it convenient to port the whole calculation environment to other systems.
CONFIGURATION FILE OF AUTOANA
"tsdnn": {
    "module": "tsdnn",
    "command": "docker run --rm --cpus=1 -e KSERVER_HOST=192.168.7.102 tsdnn %d",
    "depend": [ "tsdnn_weights", "thomson", "ip" ]
},
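One plausible way such an entry could be turned into a command line is sketched below. It assumes that `%d` is a placeholder for the shot number and that the entry sits inside a larger JSON object; how the AutoAna actually interprets the file is not stated on the slide, and `build_command` is a hypothetical helper. The sketch only builds the argument list and does not start Docker.

```python
import json
import shlex

# The entry from the slide, wrapped in braces so it parses as a JSON document.
CONFIG = json.loads("""
{
  "tsdnn": {
    "module": "tsdnn",
    "command": "docker run --rm --cpus=1 -e KSERVER_HOST=192.168.7.102 tsdnn %d",
    "depend": ["tsdnn_weights", "thomson", "ip"]
  }
}
""")


def build_command(name, shot_no, config=CONFIG):
    """Fill the shot number into the command template and split it into argv."""
    template = config[name]["command"]
    return shlex.split(template % shot_no)


argv = build_command("tsdnn", 144103)
print(argv)
```

In the command itself, `--rm` removes the container when it exits, `--cpus=1` limits the container to one CPU, and `-e` passes an environment variable into the container.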
FUTURE WORK
The authors plan to improve the AutoAna by taking advantage of Docker's features.
Implement smarter job control: change the architecture from client pull to server push, and run specific programs on suitable PCs.
Build a network distributed system (using Kubernetes?): dynamically change the number of clients, and implement dynamic allocation of resources (CPU, network, etc.).
SUMMARY
The analysis programs used by the AutoAna were migrated to a Docker environment, a container technology.
Docker resolved the problems that occur when multiple processes run on the same PC simultaneously.
Docker enables the AutoAna to allocate CPU resources flexibly.
Docker facilitates porting the systems of one project to another.