Thatch, C., & Bramwell, L. (2025). Cross-Modal Vision Representation Learning for Real-World Visual Understanding. Journal of Computer Technology and Software, 4(4). https://doi.org/10.5281/zenodo.15340705