Adri Priadana Weekly Report on Human Action Recognition and Korean Conversation Course Activities


"Adri Priadana's weekly report outlines recent activities in Human Action Recognition research and the Korean Conversation language course, focusing on model experiments, code implementation, and upcoming class schedules. It covers the challenges faced, strategies applied, and results achieved in fine-tuning models toward the target accuracy, along with insights from the NTU60 dataset and frame-sampling techniques."

  • Report
  • Activities
  • Human Action Recognition
  • Korean Conversation
  • Model Experiments




Presentation Transcript


  1. Weekly Report Adri Priadana November 7, 2023

  2. Last Week Activities
     Course (Korean Conversation Language): doing the course.
     TII Journal (Face Recognition): status is awaiting review scores.

  3. Last Week Activities
     TII Journal (Human Action Recognition):
     • Tried to apply a transformer encoder in all stages.
     • Coded and applied parallel (split) spatial (B T H W C) and temporal (B H W T C) self-attention in the 1st and 2nd stages, followed by self-attention (B T H W C) in the last stage. The result is 79.46.
     • Tried to apply a CNN in the 1st and 2nd stages, followed by parallel (split) spatial (B T H W C) and temporal (B H W T C) self-attention in the last stage. The result is better, 82.72, but not enough to outperform the target of 86.60.
     • Tried to apply some data augmentation strategies. Some libraries required a higher version; upgrading them produced errors in the framework that could not be fixed, so a new environment was set up and the framework was reinstalled from the beginning.
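The parallel (split) spatial/temporal scheme above amounts to reshaping the same B T H W C feature tensor two ways before attending. A minimal single-head numpy sketch of that idea (the function names, single-head attention, and the averaging fusion are illustrative assumptions, not the actual model code):

```python
import numpy as np

def attention(x):
    """Plain single-head dot-product self-attention over axis 1.
    x: (batch, tokens, channels)."""
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(x.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def split_spatial_temporal(x):
    """Parallel (split) spatial & temporal self-attention on a
    (B, T, H, W, C) feature tensor."""
    B, T, H, W, C = x.shape
    # Spatial branch (B T H W C): tokens are the H*W positions, batched over B*T
    spatial = attention(x.reshape(B * T, H * W, C)).reshape(B, T, H, W, C)
    # Temporal branch (B H W T C): tokens are the T frames, batched over B*H*W
    t = x.transpose(0, 2, 3, 1, 4).reshape(B * H * W, T, C)
    temporal = attention(t).reshape(B, H, W, T, C).transpose(0, 3, 1, 2, 4)
    # Fuse the two branches (simple average; the real fusion rule is not stated)
    return 0.5 * (spatial + temporal)
```

Both branches see the same features; only the choice of which axis supplies the tokens differs, which is why the two layouts are written B T H W C and B H W T C.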

  4. This Week Activities
     Course (Korean Conversation Language): doing the course; there is an additional class on Saturday, November 11, 2023, at 2-5 p.m.
     TII Journal (Human Action Recognition): continue to look for new ideas and fine-tune the model.

  5. Thank You

  6. Report on October 31, 2023
     TII Journal (Human Action Recognition):
     • Coded and built a new model instead of modifying the model from the framework that uses MMCV, which makes it easier to customize. At first there were many errors, but it finally succeeded (helped by Thuy).
     • Tried to apply a transformer encoder in all stages. The result is not good enough: 77.69 accuracy.
     • Tried to apply a CNN in the first and second stages and a transformer encoder in the third stage, then fine-tuned. The result is better, 82.50 accuracy, but not enough to outperform the target of 86.60.

  7. Last Week Activities
     TII Journal (Human Action Recognition): used different numbers of frames.

     Model     | Frames | Param (M) | GFLOPs | Original NTU60 (40K to Train) | NTU60 (5K to Train)
     Paper     | 48     | 2.0       | 15.9   | 93.70                         | -
     Our Train | 48     | 2.0       | 15.9   | 93.06                         | 86.03
     Our Train | 32     | 2.0       | 10.56  | 93.10                         | 86.60
     Our Train | 24     | 2.0       | 7.92   | -                             | 86.16
     Our Train | 36     | 2.0       | 11.88  | -                             | 86.43

     Regarding the paper, uniform sampling takes ? frames from a video by dividing the video into ? segments of equal length and randomly selecting one frame from each segment.
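The sampling rule quoted from the paper can be sketched directly. A hedged Python illustration (the segment arithmetic and function name are assumptions, not the authors' code):

```python
import random

def uniform_sample_frames(num_video_frames, num_samples):
    """Divide the video into num_samples equal-length segments and
    randomly pick one frame index from each segment."""
    indices = []
    for i in range(num_samples):
        start = i * num_video_frames // num_samples
        end = (i + 1) * num_video_frames // num_samples
        # Guard against empty segments when the video is very short
        indices.append(random.randrange(start, max(start + 1, end)))
    return indices
```

Because the segments are disjoint and ordered, the sampled indices come out in temporal order while still covering the whole clip, which is the point of uniform sampling over dense sampling.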

  8. Last Week Activities
     TII Journal (Human Action Recognition): continue to fine-tune the model and look for new ideas.

     Model | Param (M) | GFLOPs | Original NTU60 (40K to Train) | NTU60 (5K to Train)
     Paper (baseline) | 2.0 | 15.9 | 93.70 | -
     Paper (Our Train) | 2.0 | 15.9 | 93.06 | 86.03
     Add TA (Effi) >> Base1 | 2.0 | 15.9 | 93.15 | 86.83
     Add TA (Effi), 2nd try | 2.0 | 15.9 | - | 86.27
     Base1 + C1 Sp3&5 (Stg 2&3) >> Base2 | 1.8 | 14.9 | - | 86.23
     Base2 + C2 Sp3&5 (Stg 2&3) >> Base3 | 1.8 | 14.8 | 93.01 | 86.85
     Base2 + C2 Sp3&5 Stg2 + C2 Sp3&5&Tf&Iden (0.5,0.25,0.125,0.125) Stg3 | 1.6 | 14.6 | 93.10 | 86.77
     Base2 + C2 Sp3&5 Stg2 + C2 MHSA (B*T x C x H*W) & FFN Stg3 | 1.67 | 14.7 | - | 84.20
     F32 - Paper (Our Train) | 2.0 | 10.6 | 93.10 | 86.60
     F32 - Base2 + C2 Sp3&5 on Stg2 + C2 Sp3&5&MHSA (B*T x C x H*W) FFN&Iden (0.5,0.25,0.125,0.125) on Stg3 | 1.6 | 9.8 | 92.71 | 87.18
     F32 - Base2 + C2 Sp3&5 on Stg2 + C2 Sp3&5&MHSA (B x C x T*H*W) FFN&Iden (0.5,0.25,0.125,0.125) on Stg3 | 1.6 | 9.8 | 92.89 | 87.07
     F32 - Base2 + C2 Sp3&MHSA (B x C x T*H*W) FFN&Iden (0.75,0.125,0.125) on Stg3 | 1.7 | 9.8 | 92.78 | 87.27

     [Backbone diagram on the slide: Stem 7x7,32; Stg1 x4 (1x1,32 | 3x3,32 | 1x1,128); Stg2 x6 (1x1,64 | 3x3,64 | 1x1,256); Stg3 x3 (1x1,128 | 3x3,128 | 1x1,512); GA, FC.]

  9. Last Week Activities
     TII Journal (Human Action Recognition): try to fine-tune the model based on spatial-temporal data.

     Model | Param (M) | GFLOPs | Original NTU60 (40K to Train) | NTU60 (5K to Train)
     Paper (baseline) | 2.0 | 15.9 | 93.70 | -
     Paper (Our Train) | 2.0 | 15.9 | 93.06 | 86.03
     Add TA (Effi) >> Base1 | 2.0 | 15.9 | 93.15 | 86.83
     Add TA (Effi), 2nd try | 2.0 | 15.9 | - | 86.27
     Base1 + MHSA(H8) last of Stg 2&3 | 3.3 | 19.5 | OOM | OOM
     Base1 + MHSA(H8) last of Stg 2 | 2.3 | 18.3 | OOM | OOM
     Base1 + MHSA(H8) last of Stg 3 | 3.1 | 17.1 | OOM | OOM
     Base1 + C1 Sp3&5 (Stg 2&3) >> Base2 | 1.8 | 14.9 | - | 86.23
     Base2 + C2 Sp3&5 (Stg 2&3) >> Base3 | 1.8 | 14.8 | 93.01 | 86.85
     Base2 + C2 Sp3&5 Stg2 + C2 Sp3&5&Tf&Iden (0.5,0.25,0.125,0.125) Stg3 | 1.63 | 14.6 | 93.10 | 86.77
     Base2 + C2 Sp3&5 Stg2 + C2 MHSA (B*T x C x H*W) & FFN Stg3 | 1.67 | 14.67 | - | 84.20

     TA: 1 ? 1 1; Effi: only GA and Sigmoid, placed after every ResNet block.
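The footnote describes the "Effi" temporal attention as only global averaging (GA) and a Sigmoid after every ResNet block, which reads like a parameter-free squeeze-and-excite-style gate. A rough numpy sketch under that reading (the pooling axes, tensor layout, and function name are assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def efficient_temporal_attention(x):
    """Parameter-free gate on a (B, T, H, W, C) feature map:
    GA over the spatial axes, then a Sigmoid, then rescale."""
    # GA: per-frame, per-channel descriptor, keeping time and channels
    descriptor = x.mean(axis=(2, 3), keepdims=True)   # (B, T, 1, 1, C)
    # Sigmoid gate: no FC layers in the "Effi" variant, per the footnote
    gate = sigmoid(descriptor)                        # values in (0, 1)
    return x * gate                                   # recalibrated features
```

With no learned weights, this variant adds essentially zero parameters and FLOPs, consistent with the table rows where adding TA (Effi) leaves Param (M) and GFLOPs unchanged at 2.0 and 15.9.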

  10. Human Action Recognition References: Revisiting Skeleton-Based Action Recognition 10

  11. Human Action Recognition

     Model | Original NTU60 (40K to Train) | NTU60 (20K to Train) | NTU60 (10K to Train) | NTU60 (5K to Train)
     Paper | 93.70 | - | - | -
     Paper (Our Train) | 93.06 | 92.04 | 89.99 | 86.03
     Add SE | - | - | - | 85.52
     Add SA | - | - | - | 85.61
     Add TA | - | - | - | 86.07
     Add TA (Effi) >> Base1 | 93.15 | - | - | 86.83
     Add TA (Effi), 2nd try | - | - | - | 86.27
     Base1 + Using SiLU | - | - | - | 85.73
