Abstract:
This paper describes improvements to the naturalness of HMM-based speech synthesis for the Vietnamese language. In this synthesis method, trajectories of speech parameters are generated from trained hidden Markov models (HMMs), and the final speech waveform is synthesized from those parameters. The main objective is to maximize the naturalness of the output speech through three key points. First, the system uses a high-quality recorded Vietnamese speech database suitable for training, particularly for the statistical parametric approach. Second, prosodic information such as tone and part of speech (POS), together with features based on characteristics of the Vietnamese language, is added to ensure the quality of the synthetic speech. Third, the system uses STRAIGHT, which has demonstrated high-quality voice manipulation and has been successfully incorporated into HMM-based speech synthesis. The results show that speech produced by our system compares best against other Vietnamese TTS systems trained on the same speech data.
Authors: Son Trinh; Kiem Hoang
Keywords: Hidden Markov models, Speech, Speech synthesis, Context, Databases, Training, Context modeling
Journal: International Journal of Software Innovation (IJSI)
Indexing: ISSN: 2166-7160; ESCI, Scopus