ZHU, Lin; GUO, Fan; CAI, Guohui; MA, Yumeng. Structured Preference Modeling for Reinforcement Learning-Based Fine-Tuning of Large Models. Journal of Computer Technology and Software, [S. l.], v. 4, n. 4, 2025. DOI: 10.5281/zenodo.15340770. Available at: https://ashpress.org/index.php/jcts/article/view/156. Accessed on: 14 May 2025.