A Novel Resource Management Framework for Blockchain-Based Federated Learning in IoT Networks

IEEE Transactions on Sustainable Computing(2024)

引用 0|浏览2
暂无评分
摘要
At present, the centralized learning models, used for IoT applications generating large amount of data, face several challenges such as bandwidth scarcity, more energy consumption, increased uses of computing resources, poor connectivity, high computational complexity, reduced privacy, and large latency towards data transfer. In order to address the aforementioned challenges, Blockchain-Enabled Federated Learning Networks (BFLNs) emerged recently, which deal with trained model parameters only, rather than raw data. BFLNs provide enhanced security along with improved energy-efficiency and Quality-of-Service (QoS). However, BFLNs suffer with the challenges of exponential increased action space in deciding various parameter levels towards training and block generation. Motivated by aforementioned challenges of BFLNs, in this work, we are proposing an actor-critic Reinforcement Learning (RL) method to model the Machine Learning Model Owner (MLMO) in selecting the optimal set of parameter levels, addressing the challenges of exponential grow of action space in BFLNs. Further, due to the implicit entropy exploration, actor-critic RL method balances the exploration-exploitation trade-off and shows better performance than most off-policy methods, on large discrete action spaces. Therefore, in this work, considering the mobile scenario of the devices, MLMO decides the data and energy levels that the mobile devices use for the training and determine the block generation rate. This leads to minimized system latency and reduced overall cost, while achieving the target accuracy. Specifically, we have used Proximal Policy Optimization (PPO) as an on-policy actor-critic method with it's two variants, one based on Monte Carlo (MC) returns and another based on Generalized Advantage Estimate (GAE). We analyzed that PPO has better exploration and sample efficiency, lesser training time, and consistently higher cumulative rewards, when compared to off-policy Deep Q-Network (DQN).
更多
查看译文
关键词
Actor-critic reinforcement learning,and exploration-exploitation,blockchain,federated learning,internet of things (IoT),queuing theory,resource managemnet
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要