System and method for continuous multimodal speech and gesture interaction

Granted invention patent

US09710223B2 System and method for continuous multimodal speech and gesture interaction (status: in force)

Patent title: System and method for continuous multimodal speech and gesture interaction
Application number: US14875105

Filing date: 2015-10-05
Publication number: US09710223B2

Publication date: 2017-07-18
Inventors: Michael Johnston, Derya Ozkan
Applicant: Nuance Communications, Inc.
Applicant address: US MA Burlington
Patentee: Nuance Communications, Inc.
Current patentee: Nuance Communications, Inc.
Current patentee address: US MA Burlington
Main classification: G10L15/22
IPC classifications: G10L15/22; G06F3/16; G06F3/01

Abstract:

Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.
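The steps in the abstract (detect a speech event, derive a temporal window from its time, average the gesture coordinates inside that window, and combine both into a command) can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `GestureSample`, `SpeechEvent`, `temporal_window`, `average_gesture`, and `multimodal_command`, the `pad` parameter, and the choice to widen the speech span symmetrically are all assumptions made for illustration. Speech detection itself is assumed to have already happened upstream.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GestureSample:
    t: float  # timestamp in seconds (hypothetical shared clock with the audio)
    x: float  # gesture coordinate projected onto the display
    y: float


@dataclass
class SpeechEvent:
    t_start: float  # when the detected utterance began
    t_end: float    # when it ended
    text: str       # recognized words


def temporal_window(speech: SpeechEvent, pad: float = 0.5) -> Tuple[float, float]:
    """Assumed rule: the speech span widened by `pad` seconds on each side."""
    return speech.t_start - pad, speech.t_end + pad


def average_gesture(
    samples: List[GestureSample], window: Tuple[float, float]
) -> Optional[Tuple[float, float]]:
    """Average the gesture coordinates whose timestamps fall in the window."""
    lo, hi = window
    inside = [s for s in samples if lo <= s.t <= hi]
    if not inside:
        return None  # no gesture data accompanied the speech event
    n = len(inside)
    return sum(s.x for s in inside) / n, sum(s.y for s in inside) / n


def multimodal_command(
    speech: SpeechEvent, samples: List[GestureSample], pad: float = 0.5
) -> dict:
    """Combine the utterance with the averaged gesture location."""
    target = average_gesture(samples, temporal_window(speech, pad))
    return {"utterance": speech.text, "target": target}


if __name__ == "__main__":
    stream = [
        GestureSample(0.0, 0.0, 0.0),
        GestureSample(1.0, 10.0, 20.0),
        GestureSample(1.5, 20.0, 40.0),
        GestureSample(5.0, 100.0, 100.0),
    ]
    print(multimodal_command(SpeechEvent(1.0, 1.5, "zoom in here"), stream))
```

Averaging over the window smooths the jitter of a remote pointing gesture, at the cost of blurring the target if the user moves deliberately while speaking. A real system would likely need a more careful window and gesture model than this sketch assumes.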

Publication/grant documents

US20160026434A1 SYSTEM AND METHOD FOR CONTINUOUS MULTIMODAL SPEECH AND GESTURE INTERACTION Publication/grant date: 2016-01-28

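IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/22	. Procedures used during a speech recognition process, e.g. man-machine dialogue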