Vol. 1 No. 1 (2022)
Articles

Advancements in Voice Conversion: Spectrogram-Based Speech Style Transfer Using Convolutional Neural Networks

Published 2022-01-30

How to Cite

Johnson, E. (2022). Advancements in Voice Conversion: Spectrogram-Based Speech Style Transfer Using Convolutional Neural Networks. Journal of Computer Technology and Software, 1(1). Retrieved from https://ashpress.org/index.php/jcts/article/view/39

Abstract

Voice conversion (VC) transforms the phonetic style of a source speaker into that of a target speaker while preserving the semantic content. The technology has applications in communication, healthcare, entertainment, and security. Earlier neural-network-based methods have improved the quality of converted speech, and current research aims to reduce the amount of training data they require. Inspired by image style transfer, this paper uses convolutional neural networks (CNNs) to extract spectrogram features from speech signals and to stylize them. The proposed model achieves high-quality speech style transfer, demonstrating that CNNs are effective for voice conversion with reduced data dependency.
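
The abstract does not specify the network, so the following is only a minimal sketch of the general idea borrowed from image style transfer, not the paper's model. It assumes a log-magnitude spectrogram as input, a single convolution layer over time whose filters are supplied by the caller, and Gram-matrix statistics as the "style" representation. All function names (`log_spectrogram`, `conv_features`, `gram_matrix`, `style_loss`, `content_loss`) and parameter values are hypothetical and chosen for illustration.

```python
import numpy as np


def log_spectrogram(signal, n_fft=512, hop=128):
    """Log-magnitude STFT of a 1-D signal, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return np.log1p(magnitude)


def conv_features(spec, filters):
    """One convolution layer over the time axis, followed by ReLU.

    spec:    (n_freq, n_frames) spectrogram; frequency bins act as channels.
    filters: (n_filters, n_freq, width) filter bank (assumed, e.g. random).
    Returns: (n_filters, n_frames - width + 1) feature map.
    """
    width = filters.shape[2]
    n_out = spec.shape[1] - width + 1
    windows = np.stack([spec[:, t : t + width] for t in range(n_out)])
    features = np.einsum("tfw,kfw->kt", windows, filters)
    return np.maximum(features, 0.0)


def gram_matrix(features):
    """Time-averaged correlations between feature channels."""
    return features @ features.T / features.shape[1]


def style_loss(generated, target):
    """Mean squared difference between Gram matrices (speaker style proxy)."""
    return float(np.mean((gram_matrix(generated) - gram_matrix(target)) ** 2))


def content_loss(generated, source):
    """Mean squared difference between feature maps (spoken content proxy)."""
    return float(np.mean((generated - source) ** 2))
```

In a style-transfer setup of this kind, a generated spectrogram would be optimized to keep `content_loss` low against the source utterance and `style_loss` low against the target speaker. The optimization loop and the step that converts the resulting spectrogram back to a waveform are omitted here.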