
Autonomous Hybrid Beamforming for mmWave Massive MIMO Systems
Explore the use of deep reinforcement learning for fully autonomous hybrid beamforming in mmWave massive MIMO systems, addressing non-convexity and non-linearity challenges. The approach involves interaction between agents and environments to update neural network parameters, optimizing system performance. Simulation results demonstrate effective convergence towards optimal beamforming solutions.
Presentation Transcript
Fully Autonomous Hybrid Beamforming for mmWave Massive MIMO Systems Using Deep Reinforcement Learning
Yunseong Cho (yscho@utexas.edu)
The University of Texas at Austin, Wireless Networking & Communications Group
Dec. 3rd, 2021

Introduction
Problem
- We want to design a hybrid beamforming architecture.
- A hybrid beamforming architecture requires both digital and analog components.
- The analog components introduce non-convexity and non-linearity.
Solution
- Use reinforcement learning, a goal-oriented approach.
- Interaction between the agent and the environment updates the neural network parameters.
[Figure: agent-environment loop. The agent at state $s_t$ takes action $a_t = \pi(s_t)$; the environment (mmWave channel) returns reward $r_t$ and observation $s_{t+1}$; the policy $\pi$ is updated to maximize $r_t$.]
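
The agent-environment interaction on this slide can be sketched in a few lines of Python. This is a toy illustration only: `ToyChannelEnv`, `run_interaction`, the half-wavelength antenna spacing, the single-path channel, and the noise level are all assumptions. A simple keep-the-best random search stands in for the neural network policy update described in the presentation.

```python
import cmath
import math
import random


class ToyChannelEnv:
    """Hypothetical single-path environment (not from the slides).

    The action is a steering angle; the reward is the array gain of a
    steering-vector combiner pointed at that angle.
    """

    def __init__(self, n_antennas=16, true_angle=0.4):
        self.n = n_antennas
        self.true_angle = true_angle

    def step(self, action_angle):
        # Phase step between channel and combiner, assuming d / lambda = 0.5.
        phase = math.pi * (math.sin(self.true_angle) - math.sin(action_angle))
        inner = sum(cmath.exp(1j * phase * k) for k in range(self.n))
        reward = abs(inner) ** 2 / self.n
        observation = reward  # next state: the currently achieved gain
        return observation, reward


def run_interaction(env, n_steps=200, noise=0.2, seed=0):
    """Run the act / observe / update loop and return the reward history."""
    rng = random.Random(seed)
    best_action = 0.0
    _, best_reward = env.step(best_action)
    history = [best_reward]
    for _ in range(n_steps):
        # Action: current best guess plus exploration noise.
        action = best_action + rng.gauss(0.0, noise)
        _, reward = env.step(action)
        # Stand-in for the policy update: keep the action if it scored higher.
        if reward > best_reward:
            best_action, best_reward = action, reward
        history.append(best_reward)
    return best_action, history


if __name__ == "__main__":
    angle, rewards = run_interaction(ToyChannelEnv())
    print(f"best angle {angle:.3f} rad, best gain {rewards[-1]:.2f}")
```

The loop keeps the same interface as the slide (state, action, reward, next observation), so the random-search step could be swapped for an actor-critic update without changing the environment.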

System Model
System model (MIMO uplink system)
- Uniform linear array (ULA) as steering vector for analog components
  o $\mathbf{a}(N, \theta) = \left[ 1, e^{j 2\pi (d/\lambda) \sin\theta}, \ldots, e^{j 2\pi (N-1)(d/\lambda) \sin\theta} \right]^{T}$
- Discrete ray propagation model, i.e., geometric channel, which is a line-of-sight channel
  o $\mathbf{H} = \sum_{\ell=1}^{N_p} \alpha_\ell \, \mathbf{a}(N_r, \theta_{r,\ell}) \, \mathbf{a}(N_t, \theta_{t,\ell})^{H}$, where $\alpha_\ell$ is the complex gain of path $\ell$
- Notation: $d$: antenna spacing; $\lambda$: wavelength; $\theta$: ray angle; $N_p$: number of paths; $\theta_r, \theta_t$: AoA and AoD
State
- UE locations
- Currently achieved SINR
Goal (reward in RL)
- Maximize achieved SINR
Design parameters (action)
- $\mathbf{F}_{BB}$ and $\mathbf{W}_{BB}$: matrix design
- $\mathbf{F}_{RF}$: steering angles $\theta_{RF}$ (AoDs)
- $\mathbf{W}_{RF,k}$: steering angles $\theta_{RF,k}$ (AoAs)
Received signal
- $\mathbf{y} = \mathbf{W}_{BB}^{H} \mathbf{W}_{RF}^{H} \mathbf{H} \mathbf{F}_{RF} \mathbf{F}_{BB} \mathbf{s} + \mathbf{W}_{BB}^{H} \mathbf{W}_{RF}^{H} \mathbf{n}$
- Not a convex problem because of the unit-modulus constraints on $\mathbf{F}_{RF}$ and $\mathbf{W}_{RF}$
[Figure: hybrid architecture block diagram with baseband precoder $\mathbf{F}_{BB}$, RF chains, and analog precoder $\mathbf{F}_{RF}$ on one side, and analog combiners $\mathbf{W}_{RF,k}$, RF chains, and baseband combiners $\mathbf{W}_{BB,k}$ on the other.]
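
As a rough illustration of the system model, the ULA steering vector and a geometric (discrete-ray) channel can be written with the Python standard library alone. The function names `ula_steering` and `geometric_channel`, the half-wavelength default spacing, and the `(gain, aoa, aod)` path format are assumptions made for this sketch, not details taken from the slides.

```python
import cmath
import math


def ula_steering(n_antennas, theta, spacing_over_wavelength=0.5):
    """ULA steering vector [1, e^{j*phi}, ..., e^{j*(N-1)*phi}].

    phi = 2*pi*(d/lambda)*sin(theta); every entry has unit modulus.
    """
    phi = 2.0 * math.pi * spacing_over_wavelength * math.sin(theta)
    return [cmath.exp(1j * phi * n) for n in range(n_antennas)]


def geometric_channel(n_rx, n_tx, paths, spacing_over_wavelength=0.5):
    """Sum of rank-one path terms gain * a(n_rx, aoa) * a(n_tx, aod)^H.

    `paths` is a list of (complex_gain, aoa, aod) tuples (assumed format).
    Returns an n_rx x n_tx matrix as a list of rows.
    """
    channel = [[0j] * n_tx for _ in range(n_rx)]
    for gain, aoa, aod in paths:
        a_rx = ula_steering(n_rx, aoa, spacing_over_wavelength)
        a_tx = ula_steering(n_tx, aod, spacing_over_wavelength)
        for i in range(n_rx):
            for j in range(n_tx):
                channel[i][j] += gain * a_rx[i] * a_tx[j].conjugate()
    return channel


if __name__ == "__main__":
    a = ula_steering(16, math.radians(20))
    print("all unit modulus:", all(abs(abs(x) - 1.0) < 1e-12 for x in a))
    H = geometric_channel(4, 2, [(1.0 + 0j, 0.2, -0.1), (0.3j, -0.6, 0.4)])
    print("channel size:", len(H), "x", len(H[0]))
```

Because every steering-vector entry has unit modulus, an analog beamformer built from such vectors is fully described by its angle, which is why the action space can be a set of angles rather than arbitrary complex weights.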

Background
- Update the critic to minimize $\left( Q(s_t, a_t) - \left( r_t + \gamma \, Q_{target}(s_{t+1}, \pi_{target}(s_{t+1})) \right) \right)^2$
- Update the actor to maximize the expected return
- Soft update of network parameters: $\theta_{target} = \tau \, \theta_{target} + (1 - \tau) \, \theta$
[Figure: one iteration of the actor-critic loop. The actor maps state $s_t$ to action $a_t$ and the critic outputs $Q(s_t, a_t)$; the target actor maps state $s_{t+1}$ to action $\pi(s_{t+1})$ and the target critic evaluates it; the reward $r_t$ comes from the environment.]
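
The two update rules on this slide, the critic's squared temporal-difference error and the soft update of the target networks, can be sketched as below. The function names and the values of the discount factor `gamma` and mixing factor `tau` are assumptions. Parameters are represented as plain lists of floats rather than neural network weights, and `tau` weights the old target parameters.

```python
def td_target(reward, next_q, gamma=0.99):
    """Bootstrapped target: reward plus discounted target-critic value."""
    return reward + gamma * next_q


def critic_loss(q_value, reward, next_q, gamma=0.99):
    """Squared error between the critic output and the TD target."""
    return (q_value - td_target(reward, next_q, gamma)) ** 2


def soft_update(target_params, online_params, tau=0.995):
    """Move the target parameters slowly towards the online parameters.

    target <- tau * target + (1 - tau) * online, with tau close to 1.
    """
    return [
        tau * t + (1.0 - tau) * o
        for t, o in zip(target_params, online_params)
    ]


if __name__ == "__main__":
    # The critic predicts 1.2; the target critic values the next state at 1.0.
    print("critic loss:", critic_loss(q_value=1.2, reward=0.3, next_q=1.0))
    target = [0.0, 0.0]
    online = [1.0, -1.0]
    for _ in range(3):
        target = soft_update(target, online, tau=0.9)
    print("target after 3 soft updates:", target)
```

Keeping `tau` close to 1 makes the target networks change slowly, which keeps the bootstrapped target from chasing the critic it is used to train.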

Simulation Results
Single-cell UL SU-SIMO with one propagation path
- Analog RF combiner only
- The actor produces an estimated AoA
- Achieves MRC, which is optimal
- The first few iterations are for exploration; the agent then quickly converges to the global optimum (MRC, EGC)
Settings: 16 BS antennas, 1 user, 1 propagation path
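
To see why an accurate AoA estimate is enough in the single-path case, note that every entry of a single-path channel vector has the same magnitude. MRC and EGC therefore coincide, and a phase-only combiner steered at the true angle attains the full array gain. The sketch below checks this numerically. The helper names, the half-wavelength spacing, and the unit path gain are assumptions.

```python
import cmath
import math


def ula_steering(n_antennas, theta, spacing_over_wavelength=0.5):
    """ULA steering vector with unit-modulus entries."""
    phi = 2.0 * math.pi * spacing_over_wavelength * math.sin(theta)
    return [cmath.exp(1j * phi * n) for n in range(n_antennas)]


def combiner_gain(n_antennas, true_aoa, estimated_aoa):
    """Array gain |w^H h|^2 / N of a phase-only combiner w = a(estimated_aoa).

    The channel is a single unit-gain path, h = a(true_aoa).
    """
    h = ula_steering(n_antennas, true_aoa)
    w = ula_steering(n_antennas, estimated_aoa)
    inner = sum(wi.conjugate() * hi for wi, hi in zip(w, h))
    return abs(inner) ** 2 / n_antennas


if __name__ == "__main__":
    n = 16
    true_aoa = 0.3
    for guess in (0.3, 0.35, 0.5):
        gain = combiner_gain(n, true_aoa, guess)
        print(f"estimated AoA {guess:.2f} rad -> gain {gain:.3f}")
```

With 16 antennas the gain reaches 16 only when the estimated angle matches the true AoA, so the reward surface has a single sharp peak that the actor has to locate.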

Simulation Results
Single-cell DL SU-MISO with multiple propagation paths
- Analog and digital beamformer
- The actor produces estimated AoDs and matrix entries
- The analog beamformer is already parametrized by AoDs
- However, the digital beamformer needs to be designed under a power constraint, e.g., $\|\mathbf{F}_{RF} \mathbf{F}_{BB}\|_F^2 = P$
  o Otherwise, the actor simply inflates the matrix entries to maximize the reward.
  o The current approach is to normalize $\mathbf{F}_{BB}$ so that the power constraint is satisfied.
  o This does not guarantee the global optimum.
- Signal model: $y = \mathbf{H} \mathbf{F}_{RF} \mathbf{F}_{BB} \mathbf{s} + n$
Settings: 16 BS antennas, 1 user, 2 propagation paths
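
The normalization step can be illustrated for a single data stream. Whatever raw digital weights the actor outputs, they are rescaled so that the combined analog-digital beamformer meets the power budget, and inflating the raw entries then no longer changes the result. The function names, the single-stream restriction, and the unit power budget are assumptions for this sketch.

```python
import cmath
import math


def ula_steering(n_antennas, theta, spacing_over_wavelength=0.5):
    """ULA steering vector with unit-modulus entries."""
    phi = 2.0 * math.pi * spacing_over_wavelength * math.sin(theta)
    return [cmath.exp(1j * phi * n) for n in range(n_antennas)]


def analog_beamformer(n_antennas, aods):
    """N x N_rf matrix whose columns are steering vectors at the given AoDs."""
    columns = [ula_steering(n_antennas, aod) for aod in aods]
    return [[col[i] for col in columns] for i in range(n_antennas)]


def transmit_power(f_rf, f_bb):
    """Squared norm of the effective beamformer f_rf @ f_bb (single stream)."""
    effective = [sum(r * b for r, b in zip(row, f_bb)) for row in f_rf]
    return sum(abs(x) ** 2 for x in effective)


def normalize_digital(f_rf, f_bb, power=1.0):
    """Rescale f_bb so that the effective beamformer has the given power."""
    current = transmit_power(f_rf, f_bb)
    if current == 0.0:
        raise ValueError("digital beamformer produces zero transmit power")
    scale = math.sqrt(power / current)
    return [scale * b for b in f_bb]


if __name__ == "__main__":
    f_rf = analog_beamformer(16, [0.2, -0.5])  # two RF chains, two AoDs
    raw = [0.3 + 0.1j, -0.7j]                  # unconstrained actor output
    f_bb = normalize_digital(f_rf, raw, power=1.0)
    print("power after normalization:", round(transmit_power(f_rf, f_bb), 6))
```

Since scaling the raw output has no effect after normalization, the reward depends only on the direction of the digital weights, which removes the incentive to grow the entries without bound.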

Simulation Results
[Figure: achieved SNR vs. iterations]
- Takes more iterations to converge.
- Unstable convergence: after several iterations, the agent falls back to random guessing because of the exploration-exploitation trade-off.
- The second convergence approaches the same value.
- The SVD-based method violates the unit-modulus constraint of the analog beamformer.
- Comparison with other existing methods is still required.
Settings: 16 BS antennas, 1 user, 2 propagation paths