Patent search ap:("Nuance Communications, Inc.") AND inv:"Derya Ozkan" (page 1)

1.

Granted invention patent
System and method for continuous multimodal speech and gesture interaction (in force)

Publication No.: US09710223B2

Publication date: 2017-07-18

Application No.: US14875105

Filing date: 2015-10-05

Applicant: Nuance Communications, Inc.

Inventors: Michael Johnston, Derya Ozkan

IPC classification: G10L15/22, G06F3/16, G06F3/01

CPC classification: G06F3/167, G06F3/017, G06F2203/0381, G10L15/22, G10L2015/223

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.

2.

Granted invention patent
System and method for continuous multimodal speech and gesture interaction (in force)

Publication No.: US10540140B2

Publication date: 2020-01-21

Application No.: US15651315

Filing date: 2017-07-17

Applicant: Nuance Communications, Inc.

Inventors: Michael Johnston, Derya Ozkan

IPC classification: G10L15/22, G06F3/16, G06F3/01

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.

3.

Granted invention patent
System and method for continuous multimodal speech and gesture interaction (in force)

Publication No.: US11189288B2

Publication date: 2021-11-30

Application No.: US16743117

Filing date: 2020-01-15

Applicant: Nuance Communications, Inc.

Inventors: Michael Johnston, Derya Ozkan

IPC classification: G10L15/22, G06F3/01, G06F3/16

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.

4.

Invention patent application
SYSTEM AND METHOD FOR CONTINUOUS MULTIMODAL SPEECH AND GESTURE INTERACTION (under examination, published)

Publication No.: US20200150921A1

Publication date: 2020-05-14

Application No.: US16743117

Filing date: 2020-01-15

Applicant: Nuance Communications, Inc.

Inventors: Michael Johnston, Derya Ozkan

IPC classification: G06F3/16, G10L15/22, G06F3/01

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.
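
Illustrative sketch (not taken from the patent documents): the abstract shared by the records above describes detecting a speech event, identifying a temporal window around it, and averaging the gesture coordinates that fall inside that window. The Python below is one minimal reading of those steps. All names (`GestureSample`, `SpeechEvent`, `temporal_window`, `average_gesture`, `fuse`) and the padding values are hypothetical; the abstract does not say how the window is sized or what form the multimodal command takes.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class GestureSample:
    t: float  # timestamp in seconds (assumed shared clock with the audio stream)
    x: float  # display coordinate the remote gesture points at
    y: float


@dataclass(frozen=True)
class SpeechEvent:
    t_start: float
    t_end: float
    text: str


def temporal_window(speech: SpeechEvent,
                    pad_before: float = 0.5,
                    pad_after: float = 0.5) -> Tuple[float, float]:
    """Window around the speech event; the padding is an assumption."""
    return speech.t_start - pad_before, speech.t_end + pad_after


def average_gesture(samples: Sequence[GestureSample],
                    window: Tuple[float, float]) -> Optional[Tuple[float, float]]:
    """Mean (x, y) of the samples inside the window, or None if there are none."""
    lo, hi = window
    inside = [s for s in samples if lo <= s.t <= hi]
    if not inside:
        return None
    return fmean(s.x for s in inside), fmean(s.y for s in inside)


def fuse(speech: SpeechEvent,
         samples: Sequence[GestureSample],
         pad_before: float = 0.5,
         pad_after: float = 0.5) -> dict:
    """Pair the utterance with the averaged gesture point (hypothetical command shape)."""
    window = temporal_window(speech, pad_before, pad_after)
    return {
        "utterance": speech.text,
        "window": window,
        "point": average_gesture(samples, window),
    }


samples = [
    GestureSample(0.0, 0.0, 0.0),
    GestureSample(1.0, 10.0, 20.0),
    GestureSample(1.5, 20.0, 40.0),
    GestureSample(5.0, 100.0, 100.0),
]
command = fuse(SpeechEvent(1.0, 1.5, "zoom in here"), samples,
               pad_before=0.25, pad_after=0.25)
print(command["point"])  # (15.0, 30.0)
```

Averaging over the window, rather than taking a single sample at the moment of speech, makes the sketch tolerant of jitter in a gesture made at a distance from the display; whether the patented method weights or filters samples further is not stated in the abstract.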