Research Topic: Multimodal Speech Interaction Technology

 

Technical Area: Speech, Artificial Intelligence, Interaction

 

Background

Human-machine interaction is the bridge that connects users to Internet services and content. It is one of the most fundamental and sophisticated Internet technologies today. Human-human interaction naturally relies on speech, facial expressions, and gestures, but current human-machine interaction still lags far behind in naturalness. Breakthroughs in interaction technology have repeatedly brought tremendous changes to the industry: the keyboard and mouse made the graphical user interface possible, and touch-screen technology started the era of smartphones. New interaction technologies will be key to bringing a new experience of Internet access in the future. Recently, new speech interaction modalities have emerged.

 

However, the new technology is still immature, and the resulting experience is not as natural as expected. For example, a wake-up word is usually required. Such systems are unlikely to work well in noisy environments, cannot understand spontaneous spoken language, and are very sensitive to speech recognition errors.

 

Target

In this topic, we are interested in developing multimodal speech interaction, i.e., integrating other modalities with speech. By doing so, we expect the interaction experience to become more robust and natural.

 

Suggested Research Topics: