Thatch, Corwin, and Liora Bramwell. “Cross-Modal Vision Representation Learning for Real-World Visual Understanding.” Journal of Computer Technology and Software, vol. 4, no. 4, Apr. 2025, doi:10.5281/zenodo.15340705.