Patent search: ap:("Nuance Communications, Inc.") AND inv:"Michael Johnston" (page 1)

1.

Granted invention patent
System and method for speech-enabled access to media content by a ranked normalized weighted graph using speech recognition (In force)

Publication No.: US09792086B2

Publication date: 2017-10-17

Application No.: US14960912

Filing date: 2015-12-07

Applicant: Nuance Communications, Inc.

Inventors: Michael Johnston, Ebrahim Kazemzadeh

IPC classification: G10L15/00, G10L21/00, H04N5/445, G06F7/08, G06F17/30, G10L15/197, H04N21/422, H04N21/466, H04N21/482, H04N21/84, G10L15/08, G10L15/26, G10L15/06

CPC classification: G06F7/08, G06F17/30026, G06F17/30743, G06F17/30784, G10L15/06, G10L15/063, G10L15/08, G10L15/197, G10L15/265, G10L2015/0633, H04N5/445, H04N21/42203, H04N21/4662, H04N21/4668, H04N21/47214, H04N21/4828, H04N21/84

Abstract: Disclosed herein are systems, methods, and computer-readable storage media for generating a speech recognition model for a media content retrieval system. The method causes a computing device to retrieve information describing media available in a media content retrieval system, construct a graph that models how the media are interconnected based on the retrieved information, rank the information describing the media based on the graph, and generate a speech recognition model based on the ranked information. The information can be a list of actors, directors, composers, titles, and/or locations. The graph that models how the media are interconnected can further model pieces of common information between two or more media. The method can further cause the computing device to weight the graph based on the retrieved information, wherein the weighted graph is further normalized to yield a normalized weighted graph to help with speech query searching of media content using speech recognition. The graph can further model relative popularity information in the list. The method can rank information based on a PageRank algorithm.
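The ranking step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the catalog data, the co-credit weighting rule, the damping factor, and all function names (`build_graph`, `normalize`, `pagerank`) are assumptions made for illustration. It builds a weighted graph from shared credits, normalizes each node's outgoing weights to sum to 1, and runs a basic PageRank iteration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical catalog: title -> credited names (illustrative data only).
CATALOG = {
    "Movie A": ["Actor X", "Director Y"],
    "Movie B": ["Actor X", "Actor Z"],
    "Movie C": ["Actor X", "Director Y", "Actor Z"],
    "Movie D": ["Actor W"],
}

def build_graph(catalog):
    """Link names credited on the same title; edge weight = number of shared titles."""
    weights = defaultdict(lambda: defaultdict(float))
    for people in catalog.values():
        for a, b in combinations(sorted(set(people)), 2):
            weights[a][b] += 1.0
            weights[b][a] += 1.0
        for person in people:
            weights[person]  # touching the key registers names with no co-credits
    return weights

def normalize(weights):
    """Scale each node's outgoing weights so they sum to 1."""
    norm = {}
    for node, nbrs in weights.items():
        total = sum(nbrs.values())
        norm[node] = {n: w / total for n, w in nbrs.items()} if total else {}
    return norm

def pagerank(norm, damping=0.85, iterations=50):
    """Power iteration; rank held by nodes without edges is spread uniformly."""
    nodes = list(norm)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not norm[node])
        new = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for node in nodes:
            for nbr, share in norm[node].items():
                new[nbr] += damping * rank[node] * share
        rank = new
    return rank

ranks = pagerank(normalize(build_graph(CATALOG)))
ranked_names = sorted(ranks, key=ranks.get, reverse=True)
```

With this toy catalog, "Actor X" ranks first because it shares the most credits. One plausible use of such scores, again an assumption, is as prior weights for names in a recognizer's vocabulary.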

2.

Granted invention patent
System and method for continuous multimodal speech and gesture interaction (In force)

Publication No.: US10540140B2

Publication date: 2020-01-21

Application No.: US15651315

Filing date: 2017-07-17

Applicant: Nuance Communications, Inc.

Inventors: Michael Johnston, Derya Ozkan

IPC classification: G10L15/22, G06F3/16, G06F3/01

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.
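The temporal-window step in this abstract can be sketched roughly in Python as follows. The data classes, the 0.5-second padding around the speech event, and the shape of the resulting command are illustrative assumptions, not details taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class GestureSample:
    t: float  # time in seconds
    x: float  # display coordinate the gesture points at
    y: float

@dataclass
class SpeechEvent:
    start: float
    end: float
    text: str

def gesture_in_window(samples: List[GestureSample], event: SpeechEvent,
                      pad: float = 0.5) -> Optional[Tuple[float, float]]:
    """Average the gesture coordinates that fall inside the window around the speech event."""
    lo, hi = event.start - pad, event.end + pad
    points = [(s.x, s.y) for s in samples if lo <= s.t <= hi]
    if not points:
        return None  # no gesture data accompanied the utterance
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def multimodal_command(samples: List[GestureSample], event: SpeechEvent) -> dict:
    """Pair the recognized utterance with the averaged gesture target."""
    return {"utterance": event.text, "target": gesture_in_window(samples, event)}

samples = [
    GestureSample(0.0, 0, 0),
    GestureSample(1.0, 100, 200),
    GestureSample(1.5, 110, 210),
    GestureSample(2.0, 120, 220),
    GestureSample(5.0, 900, 900),
]
command = multimodal_command(samples, SpeechEvent(1.2, 1.8, "play that one"))
```

Here the window spans 0.7 s to 2.3 s, so only the three middle samples contribute and the target averages to (110.0, 210.0). Averaging is one simple way to smooth jitter from a remote pointing gesture.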

3.

Granted invention patent
System and method for improving speech recognition accuracy using textual context (In force)

Publication No.: US10546595B2

Publication date: 2020-01-28

Application No.: US15911678

Filing date: 2018-03-05

Applicant: Nuance Communications, Inc.

Inventors: Dan Melamed, Srinivas Bangalore, Michael Johnston

IPC classification: G10L15/00, G10L25/51, G10L15/19, G10L17/04, G10L15/18, G06F3/16, G10L15/05, G10L15/07, G10L15/30, G10L15/183, G10L15/22

Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving speech recognition accuracy using textual context. The method includes retrieving a recorded utterance, capturing text from a device display associated with the spoken dialog and viewed by one party to the recorded utterance, and identifying words in the captured text that are relevant to the recorded utterance. The method further includes adding the identified words to a dynamic language model, and recognizing the recorded utterance using the dynamic language model. The recorded utterance can be a spoken dialog. A time stamp can be assigned to each identified word. The method can include adding identified words to and/or removing identified words from the dynamic language model based on their respective time stamps. A screen scraper can capture text from the device display associated with the recorded utterance. The device display can contain customer service data.
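A toy Python sketch of a time-stamped dynamic language model, in the spirit of this abstract, is given below. It is an illustration under stated assumptions: the class name `DynamicLanguageModel`, the unigram scoring with add-one smoothing, the boost value, and the 60-second expiry are all hypothetical choices, and a production recognizer would rescore lattices rather than whole strings.

```python
import math
from collections import Counter

class DynamicLanguageModel:
    """Unigram model whose counts are boosted by recently captured screen words."""

    def __init__(self, base_counts, boost=5.0, max_age=60.0):
        self.base = Counter(base_counts)
        self.boost = boost      # extra count given to each on-screen word (assumed value)
        self.max_age = max_age  # seconds a captured word stays active (assumed value)
        self.stamps = {}        # word -> time stamp of its latest capture

    def add_screen_text(self, text, timestamp):
        """Record every captured word with the time it was seen on the display."""
        for word in text.lower().split():
            self.stamps[word] = timestamp

    def expire(self, now):
        """Drop captured words whose time stamp is older than max_age."""
        self.stamps = {w: t for w, t in self.stamps.items() if now - t <= self.max_age}

    def log_prob(self, sentence):
        """Add-one smoothed unigram log-probability of a hypothesis."""
        vocab = set(self.base) | set(self.stamps)
        total = sum(self.base.values()) + self.boost * len(self.stamps) + len(vocab) + 1
        score = 0.0
        for word in sentence.lower().split():
            bonus = self.boost if word in self.stamps else 0.0
            score += math.log((self.base[word] + bonus + 1.0) / total)
        return score

    def recognize(self, hypotheses):
        """Pick the hypothesis the current model scores highest."""
        return max(hypotheses, key=self.log_prob)

lm = DynamicLanguageModel({"reset": 10, "my": 10, "router": 1, "rooter": 2})
hypotheses = ["reset my rooter", "reset my router"]
before = lm.recognize(hypotheses)
lm.add_screen_text("Router X500 troubleshooting", timestamp=100.0)
during = lm.recognize(hypotheses)
lm.expire(now=200.0)
after = lm.recognize(hypotheses)
```

While the captured word "router" is active, the model prefers "reset my router". Once its time stamp ages out, the choice reverts to the base counts.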

4.

Granted invention patent
System and method for continuous multimodal speech and gesture interaction (In force)

Publication No.: US09710223B2

Publication date: 2017-07-18

Application No.: US14875105

Filing date: 2015-10-05

Applicant: Nuance Communications, Inc.

Inventors: Michael Johnston, Derya Ozkan

IPC classification: G10L15/22, G06F3/16, G06F3/01

CPC classification: G06F3/167, G06F3/017, G06F2203/0381, G10L15/22, G10L2015/223

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.

5.

Granted invention patent
System and method for continuous multimodal speech and gesture interaction (In force)

Publication No.: US11189288B2

Publication date: 2021-11-30

Application No.: US16743117

Filing date: 2020-01-15

Applicant: Nuance Communications, Inc.

Inventors: Michael Johnston, Derya Ozkan

IPC classification: G10L15/22, G06F3/01, G06F3/16

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.

6.

Invention patent application
System and Method for Creating a Presentation Using Natural Language (Under examination, published)

Publication No.: US20180246864A1

Publication date: 2018-08-30

Application No.: US15964729

Filing date: 2018-04-27

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Patrick Ehlen, David Crawford GIBBON, Mazin Gilbert, Michael Johnston, Zhu Liu, Behzad Shahraray

IPC classification: G06F17/24, G06F3/0482, G06F17/28, G06F17/21, G10L15/22, G06Q10/10, G06F17/30, G06F3/16, G10L15/26, G06F3/0481, H04N5/74, G11B27/10

CPC classification: G06F17/24, G06F3/0481, G06F3/0482, G06F3/167, G06F16/2445, G06F16/4393, G06F16/70, G06F16/951, G06F16/9535, G06F17/211, G06F17/241, G06F17/28, G06Q10/10, G10L15/22, G10L15/26, G11B27/105, H04N5/74

Abstract: The invention provides for a system, method, and computer readable medium storing instructions related to controlling a presentation in a multimodal system. The method embodiment of the invention is a method for the retrieval of information on the basis of its content for incorporation into an electronic presentation. The method comprises receiving from a user a content-based request for at least one segment from a first plurality of segments within a media presentation preprocessed to enable natural language content searchability; in response to the request, presenting a subset of the first plurality of segments to the user; receiving a selection indication from the user associated with at least one segment of the subset of the first plurality of segments and adding the selected at least one segment to a deck for use in a presentation.
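The request, subset, selection, and deck flow in this abstract can be sketched in Python as below. Plain keyword overlap stands in for whatever natural language search the preprocessed presentation actually supports. The `Segment` and `Deck` types, the scoring rule, and the sample transcripts are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(frozen=True)
class Segment:
    segment_id: str
    transcript: str  # text produced by preprocessing the media presentation

def search_segments(segments: List[Segment], request: str, limit: int = 3) -> List[Segment]:
    """Return up to `limit` segments sharing the most words with the request."""
    terms = set(request.lower().split())
    scored = []
    for seg in segments:
        overlap = len(terms & set(seg.transcript.lower().split()))
        if overlap:
            scored.append((overlap, seg))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable sort keeps source order on ties
    return [seg for _, seg in scored[:limit]]

@dataclass
class Deck:
    segments: List[Segment] = field(default_factory=list)

    def add(self, segment: Segment) -> None:
        """Add a selected segment once, preserving selection order."""
        if segment not in self.segments:
            self.segments.append(segment)

library = [
    Segment("s1", "quarterly revenue grew in europe"),
    Segment("s2", "hiring plan for engineering"),
    Segment("s3", "revenue forecast for next quarter"),
]
results = search_segments(library, "revenue forecast")  # subset presented to the user
deck = Deck()
deck.add(results[0])                                    # the user's selection
```

The user sees `s3` ahead of `s1` because it matches both request words, and the selected segment lands in the deck.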

7.

Granted invention patent
System and method for creating a presentation using natural language (In force)

Publication No.: US09959260B2

Publication date: 2018-05-01

Application No.: US14702825

Filing date: 2015-05-04

Applicant: Nuance Communications, Inc.

Inventors: Patrick Ehlen, David Crawford Gibbon, Mazin Gilbert, Michael Johnston, Zhu Liu, Behzad Shahraray

IPC classification: G06F17/00, G06F17/24, G06F17/28, G06F17/21, G06F3/0482, H04N5/74, G10L15/26, G06F3/0481, G06F17/30, G10L15/22, G06F3/16, G11B27/10, G06Q10/10

CPC classification: G06F17/24, G06F3/0481, G06F3/0482, G06F3/167, G06F17/211, G06F17/241, G06F17/28, G06F17/30056, G06F17/30418, G06F17/30781, G06F17/30864, G06F17/30867, G06Q10/10, G10L15/22, G10L15/26, G11B27/105, H04N5/74

Abstract: The invention provides for a system, method, and computer readable medium storing instructions related to controlling a presentation in a multimodal system. The method embodiment of the invention is a method for the retrieval of information on the basis of its content for incorporation into an electronic presentation. The method comprises receiving from a user a content-based request for at least one segment from a first plurality of segments within a media presentation preprocessed to enable natural language content searchability; in response to the request, presenting a subset of the first plurality of segments to the user; receiving a selection indication from the user associated with at least one segment of the subset of the first plurality of segments and adding the selected at least one segment to a deck for use in a presentation.

8.

Invention patent application
SYSTEM AND METHOD FOR CONTINUOUS MULTIMODAL SPEECH AND GESTURE INTERACTION (Under examination, published)

Publication No.: US20200150921A1

Publication date: 2020-05-14

Application No.: US16743117

Filing date: 2020-01-15

Applicant: Nuance Communications, Inc.

Inventors: Michael Johnston, Derya Ozkan

IPC classification: G06F3/16, G10L15/22, G06F3/01

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.

9.

Granted invention patent
System and method for speech-enabled access to media content by a ranked normalized weighted graph using speech recognition (In force)

Publication No.: US10114612B2

Publication date: 2018-10-30

Application No.: US15784612

Filing date: 2017-10-16

Applicant: Nuance Communications, Inc.

Inventors: Michael Johnston, Ebrahim Kazemzadeh

IPC classification: G10L15/00, G10L21/00, H04N5/445, G06F7/08, G10L15/08, G06F17/30, H04N21/84, H04N21/482, H04N21/466, H04N21/422, G10L15/197, G10L15/06, G10L15/26, H04N21/472

Abstract: Disclosed herein are systems, methods, and computer-readable storage media for generating a speech recognition model for a media content retrieval system. The method causes a computing device to retrieve information describing media available in a media content retrieval system, construct a graph that models how the media are interconnected based on the retrieved information, rank the information describing the media based on the graph, and generate a speech recognition model based on the ranked information. The information can be a list of actors, directors, composers, titles, and/or locations. The graph that models how the media are interconnected can further model pieces of common information between two or more media. The method can further cause the computing device to weight the graph based on the retrieved information, wherein the weighted graph is further normalized to yield a normalized weighted graph to help with speech query searching of media content using speech recognition. The graph can further model relative popularity information in the list. The method can rank information based on a PageRank algorithm.

10.

Granted invention patent
System and method for improving speech recognition accuracy using textual context (In force)

Publication No.: US09911437B2

Publication date: 2018-03-06

Application No.: US15146283

Filing date: 2016-05-04

Applicant: Nuance Communications, Inc.

Inventors: Dan Melamed, Srinivas Bangalore, Michael Johnston

IPC classification: G10L15/00, G10L25/51, G10L15/19, G10L17/04, G10L15/18, G06F3/16, G10L15/05, G10L15/07, G10L15/30, G10L15/183

CPC classification: G10L25/51, G06F3/162, G10L15/05, G10L15/07, G10L15/18, G10L15/183, G10L15/19, G10L15/30, G10L17/04, G10L2015/228

Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving speech recognition accuracy using textual context. The method includes retrieving a recorded utterance, capturing text from a device display associated with the spoken dialog and viewed by one party to the recorded utterance, and identifying words in the captured text that are relevant to the recorded utterance. The method further includes adding the identified words to a dynamic language model, and recognizing the recorded utterance using the dynamic language model. The recorded utterance can be a spoken dialog. A time stamp can be assigned to each identified word. The method can include adding identified words to and/or removing identified words from the dynamic language model based on their respective time stamps. A screen scraper can capture text from the device display associated with the recorded utterance. The device display can contain customer service data.
