Finding
Paper
Abstract
Although a number of algorithms exist for the generation of the fundamental frequency contour in automatic text-to-speech conversion systems, the absence of a general theory of intonation still prevents the correct derivation of this important feature in unrestricted text applications. A parallel distributed approach is presented in which two neural networks were designed to learn the F0 values for each phoneme and the F0 fluctuations within each phoneme for words that correspond to a small training set. The neural networks used for this task have demonstrated the ability to generalize their properties on new text, and their level of success depends on the composition and size of the training corpus.
Authors
M. Scordilis, J. Gowdy
Journal
International Conference on Acoustics, Speech, and Signal Processing