PERCEPTION

Speech synchronization between speech and lip shape movements for service robotics applications

Ren C. Luo, Huang Chien-Chieh, Shu-Ruei Chang, Yi-Jeng Tsai

Year: 2011
Citations: 2

Abstract

Synchronizing speech with mouth shape draws on technologies such as computer vision, speech synthesis, and speech recognition. We present a method to synchronize mouth-shape images with synthesized speech, using Microsoft's Speech Application Programming Interface (SAPI) as the speech synthesis tool. Speech animation has two components: the speech and the image. The speech output is produced by Text-to-Speech (TTS), and the viseme images are generated with the FaceGen Modeller software. Three key pictures are imported into this software to calibrate and generate the face model. A viseme event handler written in C# links each viseme to its mouth-shape image. The images are loaded sequentially, so that each viseme is matched in turn with its correct image. Speech synthesis is mainly used in assistive devices, e.g. screen readers for people with visual impairment. A person who cannot speak can also use this technology to talk to others. In recent years, speech synthesis has been applied extensively in service robotics and in entertainment productions such as language learning, education, video games, animations, and music videos.
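
The viseme-to-image matching described in the abstract can be illustrated with a minimal sketch. It is not the authors' C# handler: it is written in Python, it does not call SAPI, and it only assumes that each viseme event carries a viseme ID (SAPI defines 22 of them, numbered 0 to 21) and a duration. The `VisemeEvent` class, the `viseme_NN.png` file-name scheme, and the demo event sequence are all hypothetical.

```python
from dataclasses import dataclass

# SAPI defines 22 viseme IDs (0 through 21); ID 0 is silence.
NUM_VISEMES = 22


@dataclass
class VisemeEvent:
    """Hypothetical stand-in for the data carried by one viseme event."""
    viseme_id: int     # expected range: 0..21
    duration_ms: int   # how long this mouth shape is held


def image_for(viseme_id: int) -> str:
    """Return the assumed file name of the mouth-shape image for a viseme."""
    if not 0 <= viseme_id < NUM_VISEMES:
        raise ValueError(f"viseme id out of range: {viseme_id}")
    return f"viseme_{viseme_id:02d}.png"


def build_timeline(events):
    """Turn viseme events into (start_ms, image) pairs in playback order."""
    timeline = []
    start_ms = 0
    for ev in events:
        timeline.append((start_ms, image_for(ev.viseme_id)))
        start_ms += ev.duration_ms
    return timeline


if __name__ == "__main__":
    # Made-up event sequence, for illustration only.
    demo = [
        VisemeEvent(0, 50),
        VisemeEvent(12, 80),
        VisemeEvent(4, 120),
        VisemeEvent(0, 60),
    ]
    for start, image in build_timeline(demo):
        print(f"{start:4d} ms -> {image}")
```

In a live system the timeline would not be precomputed; an event handler would swap the displayed image each time the speech engine reports a new viseme. The precomputed form is used here only to keep the sketch self-contained.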

Keywords

Viseme, Speech synthesis, Computer science, Speech recognition, Software, Synchronization (alternating current), Speech analytics, Speech technology, Service (business), Computer facial animation
