Continuous Control-Based Load Balancing for Distributed Systems Using TD3 Reinforcement Learning
Published 2024-09-30

This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
This paper addresses the resource waste and performance degradation caused by load imbalance in distributed systems, proposing a load balancing optimization method based on the TD3 (Twin Delayed Deep Deterministic Policy Gradient) algorithm. By modeling system scheduling as a Markov Decision Process with a continuous action space, the agent dynamically generates task migration ratios from observed system states, enabling fine-grained control over multi-node resource allocation. A state space is designed around indicators such as CPU utilization and task queue length, and a reward function is constructed that jointly accounts for latency, resource utilization, and migration overhead. Across multiple experimental scenarios, the proposed method outperforms mainstream algorithms such as Q-Learning, DQN, and PPO in average latency, resource utilization, and scheduling robustness. Further tests in environments involving multi-resource collaborative scheduling and node-failure disturbances validate the strategy's stability and adaptability. The experiments also demonstrate that the model converges during training, indicating that the proposed strategy is trainable and generalizable, and that it effectively improves the overall scheduling efficiency of distributed systems.
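As a rough illustration of the setup the abstract describes, the sketch below implements a toy version of the scheduling environment: per-node CPU utilization and queue lengths form the state, continuous task migration ratios form the action, and the reward trades off latency, utilization balance, and migration overhead. The class name LoadBalanceEnv, the migrate-to-least-loaded rule, the arrival/service dynamics, and the reward weights are all illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

class LoadBalanceEnv:
    """Toy distributed-scheduling environment (illustrative sketch only).

    State:  per-node CPU utilization and task-queue length.
    Action: continuous values in [-1, 1], mapped to per-node migration
            ratios, i.e. the fraction of each node's queue to shed to
            the currently least-loaded node (an assumed policy detail).
    Reward: penalizes queueing latency, load imbalance, and migration
            overhead, with hypothetical weights.
    """

    def __init__(self, n_nodes=4, seed=0):
        self.n_nodes = n_nodes
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.cpu = self.rng.uniform(0.2, 0.9, self.n_nodes)            # CPU utilization
        self.queue = self.rng.integers(0, 50, self.n_nodes).astype(float)  # queue lengths
        return self._state()

    def _state(self):
        # Normalized state vector: [cpu_1..cpu_n, queue_1..queue_n].
        return np.concatenate([self.cpu, self.queue / 100.0])

    def step(self, action):
        action = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
        ratios = (action + 1.0) / 2.0          # map actions to migration ratios in [0, 1]
        target = int(np.argmin(self.queue))    # migrate toward the least-loaded node

        migrated = ratios * self.queue
        migrated[target] = 0.0                 # the target node does not shed its own work
        self.queue -= migrated
        self.queue[target] += migrated.sum()

        # New tasks arrive and each node drains work in proportion to its CPU capacity.
        self.queue += self.rng.poisson(5, self.n_nodes)
        self.queue = np.maximum(self.queue - 8.0 * self.cpu, 0.0)
        self.cpu = np.clip(self.queue / 60.0, 0.05, 1.0)

        latency = self.queue.mean()            # proxy for average queueing delay
        imbalance = self.queue.std()           # proxy for utilization spread across nodes
        overhead = migrated.sum()              # migration cost
        # Hypothetical reward coefficients; the paper does not publish its weights.
        reward = -(0.05 * latency + 0.05 * imbalance + 0.01 * overhead)
        return self._state(), reward, False, {}

if __name__ == "__main__":
    env = LoadBalanceEnv()
    state = env.reset()
    for _ in range(5):
        action = np.random.uniform(-1.0, 1.0, env.n_nodes)  # stand-in for a TD3 actor
        state, reward, done, info = env.step(action)
        print(f"reward={reward:.3f}  queues={np.round(env.queue, 1)}")
```

Wrapped as a gymnasium.Env, an environment of this shape could be trained with an off-the-shelf TD3 implementation such as Stable-Baselines3's TD3("MlpPolicy", env), whose deterministic continuous-action actor matches the migration-ratio control the abstract describes.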