Thesis information

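Chinese title:

 时变水声网络中基于强化学习的资源分配研究

Name:

 Zhang Jian (张健)

Student ID:

 20011210577

Confidentiality level:

 Public

Thesis language:

 chi

Discipline code:

 110503

Discipline name:

 Military Communications

Student type:

 Master's

Degree:

 Master of Military Science

University:

 Xidian University

School:

 School of Telecommunications Engineering

Major:

 Military Command Science

Research direction:

 Special communications and new network mechanisms

First supervisor:

 Zhang Gangshan (张岗山)

First supervisor's affiliation:

 Xidian University

Completion date:

 2023-03-31

Defense date:

 2023-05-29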

Foreign-language title:

 Research On Network Resource Allocation Technology Based On Reinforcement Learning In Time-varying Underwater Acoustic Networks    

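Chinese keywords:

 时变水声网络 (time-varying underwater acoustic network) ; 资源分配 (resource allocation) ; 强化学习 (reinforcement learning) ; 最大交付时延 (maximum delivery delay)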

Foreign-language keywords:

 Time-Varying Underwater Acoustic Networks ; Resource Allocation ; Reinforcement Learning ; Maximum Delivery Delay    

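Chinese abstract:

Electromagnetic waves attenuate severely in seawater, and building network base stations in the ocean is very difficult, so underwater acoustic communication networks have become an important means of underwater wireless communication. In such a network, nodes move over time and use acoustic communication equipment to form a self-organizing network that provides the required services, and the resource allocation scheme (including route planning and congestion control) is a key factor in the quality of network service. In a time-varying network there may be no transmission path from the source node to the destination node that stays connected throughout, so resource allocation schemes designed for traditional communication networks do not suit time-varying underwater acoustic networks. Moreover, underwater acoustic communication has non-negligible propagation delay, and different services have different delay requirements, so an adaptive resource allocation method suited to time-varying underwater acoustic networks is needed.

Reinforcement learning, a machine learning method that accumulates experience through repeated trial and error, is often applied to large-scale sequential decision problems with complex models. For the complex combinations of link resources and connected time-slot resources in time-varying underwater acoustic networks, reinforcement learning offers an effective way to allocate network resources sensibly.

Building on an analysis of existing routing algorithms and congestion control mechanisms, and taking into account the characteristics of the time-varying underwater acoustic environment and of the target services, this thesis first designs and implements a multi-service, multi-path resource allocation scheme. The scheme uses a latest-slot sending path search algorithm to assign, centrally and one service at a time, the path with the latest sending slot. This maximizes the number of services the network carries while meeting each service's maximum delivery delay, and so increases network capacity. Because latest-slot sending paths lead to relatively large delivery delays, reinforcement learning is then used to further optimize the initial transmission scheme, reducing the average delivery delay as far as possible and balancing the network load. The thesis also designs and implements a congestion control mechanism for distributed time-varying underwater acoustic networks. It acts in two directions: "cache and delayed sending" at transit nodes, and adjustment by the source node of the data volume sent over each of a service's paths. A reinforcement learning algorithm adjusts the transmission scheme so that the network adapts dynamically to bursts in service data volume and to fluctuations in link conditions, services are delivered on time, and network stability improves.

Finally, the effectiveness of the designed resource allocation system is verified on the MiniNet simulation platform together with the RYU controller. The experimental results show that, compared with Contact Graph Routing (CGR), the classic routing algorithm for time-varying networks, the transmission scheme planned in this thesis supports more services delivered on time under the same network environment and service set, and the optimized delivery delay approaches that of the earliest-arrival path. The system can also adjust the transmission scheme dynamically in response to changes in service data volume and fluctuations in link conditions, keeping service delivery on time.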

Foreign-language abstract:

Electromagnetic waves attenuate severely when propagating in seawater, and it is very difficult to build network base stations in the marine environment, so underwater acoustic communication networks have become an important means of underwater wireless communication. In such networks, nodes move over time and use acoustic communication equipment to form a self-organized network that provides the relevant services. The resource allocation scheme (including route planning and congestion control) is a key factor affecting the quality of network service. However, resource allocation schemes for traditional communication networks are not suitable for time-varying underwater acoustic communication networks, since a time-varying network may have no always-connected transmission path from the source node to the destination node. In addition, the propagation delay in underwater acoustic communication is not negligible, and different services have different delay requirements, so it is necessary to study adaptive resource allocation methods suited to time-varying underwater acoustic communication networks.

 

On the other hand, reinforcement learning, a machine learning method that accumulates experience through repeated trial and error, is often used to solve large-scale sequential decision-making problems with complex models. For the complex combinations of link resources and connected time-slot resources in time-varying underwater acoustic communication networks, reinforcement learning can provide an effective way to allocate network resources reasonably.
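To make the trial-and-error idea concrete, here is a minimal sketch of tabular Q-learning on a toy slot-selection problem. It is not the thesis's model: the two-link schedule `DELAY`, the action set, the cost values and all function names are assumptions invented for illustration. A source node decides in each slot whether to wait or to transmit on one of two intermittently available links, and the reward is the negative number of slots spent.

```python
import random

# Hypothetical contact schedule: DELAY[link][slot] is the delivery delay (in slots)
# if the source transmits on that link in that slot, or None if the link is down.
DELAY = {
    "A": [None, 5, 1, 4],
    "B": [6, None, None, 1],
}
ACTIONS = ["wait", "A", "B"]
N_SLOTS = 4
MISS_PENALTY = -10.0    # assumed cost of never delivering the data
FAILED_TX_COST = -2.0   # assumed cost of transmitting on a link that is down


def step(slot, action):
    """Apply one decision at the source; return (next_slot, reward, done)."""
    name = ACTIONS[action]
    if name != "wait" and DELAY[name][slot] is not None:
        return slot, -float(DELAY[name][slot]), True     # data delivered
    reward = -1.0 if name == "wait" else FAILED_TX_COST
    if slot + 1 == N_SLOTS:
        return slot, reward + MISS_PENALTY, True          # ran out of slots
    return slot + 1, reward, False


def q_learning(episodes=5000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Learn Q[slot][action] with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_SLOTS)]
    for _ in range(episodes):
        slot, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))       # explore
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[slot][a])
            nxt, reward, done = step(slot, action)
            target = reward if done else reward + gamma * max(q[nxt])
            q[slot][action] += alpha * (target - q[slot][action])
            slot = nxt
    return q


def greedy_policy(q):
    """Best action name for every slot according to the learned table."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda a: row[a])] for row in q]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

In this schedule the cheapest plan is to wait through slots 0 and 1 and transmit on link A in slot 2, for a total cost of 3 slots. The agent has to discover that by trying the alternatives, which is the property the paragraph above relies on.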

 

Based on an analysis of existing routing algorithms and congestion control mechanisms, this thesis first designs and implements a multi-service, multi-path resource allocation scheme that takes full account of the characteristics of the environment and of the target traffic. In this scheme, a latest-slot sending path search algorithm centrally assigns to each service in turn the transmission path with the latest sending slot, maximizing the number of services carried by the network while meeting the maximum delivery delay of each service. Furthermore, to address the additional delivery delay introduced by latest-slot sending paths, the initial transmission scheme is further optimized with reinforcement learning to reduce the average service delivery delay as much as possible and to balance the network load. After that, a congestion control mechanism for distributed time-varying underwater acoustic communication networks is designed and implemented, in which a reinforcement learning algorithm dynamically adjusts the transmission scheme at both the transit nodes and the source node. At a transit node, a cache-and-delayed-sending method adjusts the use of time-slot resources; at the source node, service traffic is shifted among the different transmission paths. The network can thus adapt dynamically to bursts in service throughput and fluctuations in link conditions, which ensures that services are delivered on time and improves network stability.
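This record does not give the details of the latest-slot sending path search, so the following is only a sketch of one common way to find a latest-departure path in a time-slotted contact schedule: scan the contacts from latest to earliest and propagate, backwards from the destination, the latest slot at which data may still be at each node. The `Contact` fields, the one-service-per-contact rule in `allocate`, and the example schedule are all assumptions for illustration.

```python
from typing import NamedTuple, Optional


class Contact(NamedTuple):
    """One transmission opportunity (field names are illustrative)."""
    src: str
    dst: str
    slot: int    # slot in which the transmission starts
    delay: int   # slots until the data is usable at dst (assumed >= 1)


def latest_departure_path(contacts, source, dest, deadline) -> Optional[list]:
    """Return the contacts of a path that leaves `source` as late as possible
    while still reaching `dest` by `deadline`, or None if there is none."""
    if source == dest:
        return []
    latest = {dest: deadline}   # latest slot at which data may still sit at a node
    next_hop = {}
    # Scan from the latest contact to the earliest. Because delay >= 1, the first
    # value recorded for a node is already its final (latest) departure slot.
    for c in sorted(contacts, key=lambda c: c.slot, reverse=True):
        if c.src == dest or c.src in latest or c.dst not in latest:
            continue
        if c.slot + c.delay <= latest[c.dst]:
            latest[c.src] = c.slot
            next_hop[c.src] = c
    if source not in next_hop:
        return None
    path, node = [], source
    while node != dest:
        path.append(next_hop[node])
        node = next_hop[node].dst
    return path


def allocate(contacts, services):
    """Assign paths service by service. Simplifying assumption: each contact can
    carry at most one service, so used contacts are removed from the pool."""
    free = list(contacts)
    plan = {}
    for name, source, dest, deadline in services:
        path = latest_departure_path(free, source, dest, deadline)
        plan[name] = path
        if path:
            free = [c for c in free if c not in path]
    return plan


# Hypothetical schedule with relays A and B between source S and destination D.
CONTACTS = [
    Contact("S", "A", 0, 2), Contact("A", "D", 2, 3),
    Contact("S", "A", 3, 2), Contact("A", "D", 6, 2),
    Contact("S", "B", 4, 1), Contact("B", "D", 5, 4),
]

if __name__ == "__main__":
    plan = allocate(CONTACTS, [("video", "S", "D", 8), ("alarm", "S", "D", 5)])
    for name, path in plan.items():
        print(name, path)
```

In this example the first service, with deadline 8, is given the late path (leaving S in slot 3), which leaves the early contacts free for the second service, whose deadline is slot 5. Had the first service taken the early path instead, the second could not be delivered on time, which illustrates why sending as late as the deadline allows can raise the number of services carried.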

 

Finally, the MiniNet simulation platform with the RYU controller is used to verify the effectiveness of the time-varying underwater acoustic network resource allocation system. Experimental results show that, compared with the classical Contact Graph Routing (CGR) algorithm for time-varying networks, the transmission scheme planned in this thesis supports more services delivered on time under the same network environment and service set, and the optimized delivery delay is close to that of the earliest-arrival path. In addition, the system can dynamically adjust the transmission scheme according to changes in service throughput and fluctuations in link conditions to ensure on-time service delivery.
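The earliest-arrival path used as the delay reference above can be illustrated with a simplified search over a contact schedule. This sketch is not the full CGR procedure and is not taken from the thesis. It only computes the earliest slot at which data can reach the destination, assuming the hypothetical `Contact` fields and example schedule shown here and assuming a node may forward data in the same slot in which it becomes available.

```python
from typing import NamedTuple, Optional


class Contact(NamedTuple):
    """One transmission opportunity (field names are illustrative)."""
    src: str
    dst: str
    slot: int    # slot in which the transmission starts
    delay: int   # slots until the data is usable at dst (assumed >= 1)


def earliest_arrival(contacts, source, dest, start=0) -> Optional[int]:
    """Earliest slot at which data created at `source` in slot `start` can be
    available at `dest`, or None if `dest` is unreachable."""
    arrival = {source: start}
    # Scan contacts in order of start slot. Because delay >= 1, a contact can
    # only be enabled by arrivals produced by strictly earlier contacts.
    for c in sorted(contacts, key=lambda c: c.slot):
        if c.src in arrival and arrival[c.src] <= c.slot:
            t = c.slot + c.delay
            if t < arrival.get(c.dst, float("inf")):
                arrival[c.dst] = t
    return arrival.get(dest)


# Hypothetical schedule with relays A and B between source S and destination D.
CONTACTS = [
    Contact("S", "A", 0, 2), Contact("A", "D", 2, 3),
    Contact("S", "A", 3, 2), Contact("A", "D", 6, 2),
    Contact("S", "B", 4, 1), Contact("B", "D", 5, 4),
]

if __name__ == "__main__":
    for start in (0, 1, 5):
        print(start, earliest_arrival(CONTACTS, "S", "D", start))
```

For this schedule, data ready in slot 0 can reach D in slot 5, data ready in slot 1 has to wait for the later contacts and arrives in slot 8, and data ready in slot 5 cannot be delivered because every contact out of S has already passed.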

Chinese Library Classification (CLC) number:

 TP3

Collection number:

 58165

Open-access date:

 2023-12-24
