(1)

Zhu, L.; Guo, F.; Cai, G.; Ma, Y. Structured Preference Modeling for Reinforcement Learning-Based Fine-Tuning of Large Models. JCTS 2025, 4.