Patent search: ap:("Dan Melamed" OR "Srinivas Bangalore" OR "Michael Johnston") AND inv:"Srinivas Bangalore" (page 1)

1.

Granted patent
System and method for improving speech recognition accuracy using textual context [Status: in force]

Publication No.: US08571866B2

Publication date: 2013-10-29

Application No.: US12604628

Filing date: 2009-10-23

Applicants: Dan Melamed, Srinivas Bangalore, Michael Johnston

Inventors: Dan Melamed, Srinivas Bangalore, Michael Johnston

IPC classification: G10L15/06

CPC classification: G10L25/51, G06F3/162, G10L15/05, G10L15/07, G10L15/18, G10L15/183, G10L15/19, G10L15/30, G10L17/04, G10L2015/228

Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving speech recognition accuracy using textual context. The method includes retrieving a recorded utterance, capturing text from a device display associated with the spoken dialog and viewed by one party to the recorded utterance, and identifying words in the captured text that are relevant to the recorded utterance. The method further includes adding the identified words to a dynamic language model, and recognizing the recorded utterance using the dynamic language model. The recorded utterance can be a spoken dialog. A time stamp can be assigned to each identified word. The method can include adding identified words to and/or removing identified words from the dynamic language model based on their respective time stamps. A screen scraper can capture text from the device display associated with the recorded utterance. The device display can contain customer service data.
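
The time-stamp mechanism described in the abstract above can be illustrated with a small sketch. This is not the patented implementation: the class name DynamicVocabulary, the 60-second expiry window, the fixed bonus of 1.0, and the n-best rescoring step are all assumptions made for illustration, and the relevance-filtering step from the abstract is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class DynamicVocabulary:
    """Hypothetical stand-in for the abstract's 'dynamic language model'."""
    max_age_s: float = 60.0                    # assumed expiry window
    _seen: dict = field(default_factory=dict)  # word -> latest time stamp

    def add_captured_text(self, text: str, timestamp: float) -> None:
        # Record (or refresh) the time stamp of every word captured from the display.
        for word in text.lower().split():
            self._seen[word] = timestamp

    def expire(self, now: float) -> None:
        # Drop words whose time stamp is older than the window.
        self._seen = {w: t for w, t in self._seen.items()
                      if now - t <= self.max_age_s}

    def boost(self, word: str) -> float:
        # Assumed scoring rule: words currently in the model earn a fixed bonus.
        return 1.0 if word.lower() in self._seen else 0.0


def rescore(hypotheses, vocab):
    """Return the (text, base_score) hypothesis with the best score plus context bonus."""
    return max(hypotheses,
               key=lambda h: h[1] + sum(vocab.boost(w) for w in h[0].split()))


vocab = DynamicVocabulary(max_age_s=30.0)
vocab.add_captured_text("Account holder Melamed", timestamp=0.0)
n_best = [("call mel amid", -1.0), ("call melamed", -1.5)]
choice_with_context = rescore(n_best, vocab)   # screen text favors "call melamed"
vocab.expire(now=100.0)                        # captured words are now stale
choice_after_expiry = rescore(n_best, vocab)   # falls back to the base scores
```

Keeping only the latest time stamp per word means a word that stays on screen is refreshed rather than duplicated, which is one simple way to realize the add/remove behavior the abstract mentions.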

2.

Patent application
SYSTEM AND METHOD FOR IMPROVING SPEECH RECOGNITION ACCURACY USING TEXTUAL CONTEXT [Status: in force]

Publication No.: US20110099013A1

Publication date: 2011-04-28

Application No.: US12604628

Filing date: 2009-10-23

Applicants: Dan Melamed, Srinivas Bangalore, Michael Johnston

Inventors: Dan Melamed, Srinivas Bangalore, Michael Johnston

IPC classification: G10L15/28, G10L15/00

CPC classification: G10L25/51, G06F3/162, G10L15/05, G10L15/07, G10L15/18, G10L15/183, G10L15/19, G10L15/30, G10L17/04, G10L2015/228

Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving speech recognition accuracy using textual context. The method includes retrieving a recorded utterance, capturing text from a device display associated with the spoken dialog and viewed by one party to the recorded utterance, and identifying words in the captured text that are relevant to the recorded utterance. The method further includes adding the identified words to a dynamic language model, and recognizing the recorded utterance using the dynamic language model. The recorded utterance can be a spoken dialog. A time stamp can be assigned to each identified word. The method can include adding identified words to and/or removing identified words from the dynamic language model based on their respective time stamps. A screen scraper can capture text from the device display associated with the recorded utterance. The device display can contain customer service data.

3.

Granted patent
Learning edit machines for robust multimodal understanding [Status: in force]

Publication No.: US07716039B1

Publication date: 2010-05-11

Application No.: US11279804

Filing date: 2006-04-14

Applicants: Srinivas Bangalore, Michael Johnston

Inventors: Srinivas Bangalore, Michael Johnston

IPC classification: G06F17/27

CPC classification: G06F17/289, G06F17/27

Abstract: A system and method are disclosed for processing received data associated with a grammar. The method comprises receiving input data having a characteristic that the input data cannot be assigned an interpretation by a grammar, translating the input data into translated input data and submitting the translated input data into the grammar. The transducer coerces the set of strings encoded in a lattice resulting from recognition (such as speech recognition) to the closest strings in the grammar that can be assigned an interpretation.
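
The idea of coercing recognizer output to the closest string the grammar can interpret can be sketched as follows. The abstract describes a transducer operating on a lattice; this sketch swaps in a plain word-level Levenshtein distance over a short list of hypotheses and an explicitly enumerated set of grammar strings, purely as an assumption to keep the example small. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete x
                            curr[j - 1] + 1,            # insert y
                            prev[j - 1] + (x != y)))    # substitute or match
        prev = curr
    return prev[-1]


def coerce_to_grammar(hypotheses, grammar_strings):
    """Return (closest grammar string, distance) over all hypothesis/grammar pairs."""
    distance, target = min(
        (edit_distance(h.split(), g.split()), g)
        for h in hypotheses
        for g in grammar_strings
    )
    return target, distance


grammar = ["show cheap italian restaurants", "show restaurants near here"]
closest = coerce_to_grammar(["show cheap talian places"], grammar)
```

Enumerating grammar strings only works for tiny grammars; a finite-state formulation, as the abstract indicates, avoids listing every string.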

4.

Granted patent
System and method for accessing and annotating electronic medical records using a multi-modal interface [Status: in force]

Publication No.: US07499862B1

Publication date: 2009-03-03

Application No.: US11788890

Filing date: 2007-04-23

Applicants: Srinivas Bangalore, Charles Douglas Blewett, Michael Johnston

Inventors: Srinivas Bangalore, Charles Douglas Blewett, Michael Johnston

IPC classification: G10L21/00

CPC classification: G06F3/04883, G06F3/038, G06F3/167, G06F19/00, G10L15/26, G16H10/60, G16H15/00, G16H40/63, Y10S707/99945, Y10S707/99948

Abstract: A system and method of exchanging medical information between a user and a computer device is disclosed. The computer device can receive user input in one of a plurality of types of user input comprising speech, pen, gesture and a combination of speech, pen and gesture. The method comprises receiving information from the user associated with a medical condition and a bodily location of the medical condition on a patient in one of a plurality of types of user input, presenting in one of a plurality of types of system output an indication of the received medical condition and the bodily location of the medical condition, and presenting to the user an indication that the computer device is ready to receive further information. The invention enables a more flexible multi-modal interactive environment for entering medical information into a computer device. The medical device also generates multi-modal output for presenting a patient's medical condition in an efficient manner.

5.

Granted patent
System and method for accessing and annotating electronic medical records using multi-modal interface [Status: in force]

Publication No.: US07225131B1

Publication date: 2007-05-29

Application No.: US10329123

Filing date: 2002-12-24

Applicants: Srinivas Bangalore, Charles Douglas Blewett, Michael Johnston

Inventors: Srinivas Bangalore, Charles Douglas Blewett, Michael Johnston

IPC classification: G10L21/00

CPC classification: G06F3/04883, G06F3/038, G06F3/167, G06F19/00, G10L15/26, G16H10/60, G16H15/00, G16H40/63, Y10S707/99945, Y10S707/99948

Abstract: A system and method of exchanging medical information between a user and a computer device is disclosed. The computer device can receive user input in one of a plurality of types of user input comprising speech, pen, gesture and a combination of speech, pen and gesture. The method comprises receiving information from the user associated with a medical condition and a bodily location of the medical condition on a patient in one of a plurality of types of user input, presenting in one of a plurality of types of system output an indication of the received medical condition and the bodily location of the medical condition, and presenting to the user an indication that the computer device is ready to receive further information. The invention enables a more flexible multi-modal interactive environment for entering medical information into a computer device. The medical device also generates multi modal output for presenting a patient's medical condition in an efficient manner.

6.

Granted patent
System and method for using prosody for voice-enabled search [Status: in force]

Publication No.: US10002608B2

Publication date: 2018-06-19

Application No.: US12884959

Filing date: 2010-09-17

Applicants: Srinivas Bangalore, Junlan Feng, Michael Johnston, Taniya Mishra

Inventors: Srinivas Bangalore, Junlan Feng, Michael Johnston, Taniya Mishra

IPC classification: G10L15/18, G10L25/54, G10L25/63, G10L15/22

CPC classification: G10L15/1807, G10L25/54, G10L25/63, G10L2015/226, G10L2015/227

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating relevant responses to a user query with voice-enabled search. A system practicing the method receives a word lattice generated by an automatic speech recognizer based on a user speech and a prosodic analysis of the user speech, generates a reweighted word lattice based on the word lattice and the prosodic analysis, approximates based on the reweighted word lattice one or more relevant responses to the query, and presents to a user the responses to the query. The prosodic analysis examines metalinguistic information of the user speech and can identify the most salient subject matter of the speech, assess how confident a speaker is in the content of his or her speech, and identify the attitude, mood, emotion, sentiment, etc. of the speaker. Other information not described in the content of the speech can also be used.
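
A rough sketch of lattice reweighting follows. It is heavily simplified and rests on assumptions not found in the abstract: the lattice is represented as a confusion network (a list of slots mapping candidate words to posteriors), prosodic analysis is reduced to one prominence value per slot, the reweighting rule is a multiplication by (1 + prominence), and responses are ranked by weighted word overlap. All names are hypothetical.

```python
def reweight(confusion_network, prominence):
    """Scale each slot's word posteriors by (1 + prosodic prominence of that slot)."""
    return [
        {word: p * (1.0 + prom) for word, p in slot.items()}
        for slot, prom in zip(confusion_network, prominence)
    ]


def rank(documents, reweighted):
    """Order candidate responses by the summed weight of lattice words they contain."""
    def score(doc):
        tokens = set(doc.lower().split())
        return sum(weight
                   for slot in reweighted
                   for word, weight in slot.items()
                   if word in tokens)
    return sorted(documents, key=score, reverse=True)


# Three slots; the speaker stressed the last word, so its slot gets prominence 1.0.
network = [{"hotels": 1.0}, {"in": 1.0}, {"boston": 0.6, "austin": 0.4}]
reweighted = reweight(network, prominence=[0.0, 0.0, 1.0])
ranked = rank(["hotels in austin", "hotels in boston"], reweighted)
```

The effect is that words the speaker emphasized contribute more to the match score than words that were merely recognized with high confidence.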

7.

Patent application
SYSTEM AND METHOD FOR ENHANCING VOICE-ENABLED SEARCH BASED ON AUTOMATED DEMOGRAPHIC IDENTIFICATION [Status: in force]

Publication No.: US20120072219A1

Publication date: 2012-03-22

Application No.: US12888012

Filing date: 2010-09-22

Applicants: Michael Johnston, Srinivas Bangalore, Junlan Feng, Taniya Mishra

Inventors: Michael Johnston, Srinivas Bangalore, Junlan Feng, Taniya Mishra

IPC classification: G10L15/04

CPC classification: G06F17/30026, G06F17/30976, G06F17/30979, G10L15/22, G10L2015/227

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating responses to a user speech query in voice-enabled search based on metadata that include demographic features of the speaker. A system practicing the method recognizes received speech from a speaker to generate recognized speech, identifies metadata about the speaker from the received speech, and feeds the recognized speech and the metadata to a question-answering engine. Identifying the metadata about the speaker is based on voice characteristics of the received speech. The demographic features can include age, gender, socio-economic group, nationality, and/or region. The metadata identified about the speaker from the received speech can be combined with or override self-reported speaker demographic information.

8.

Patent application
SYSTEM AND METHOD FOR MULTIMODAL INTERACTION USING ROBUST GESTURE PROCESSING [Status: pending, published]

Publication No.: US20100281435A1

Publication date: 2010-11-04

Application No.: US12433320

Filing date: 2009-04-30

Applicants: Srinivas Bangalore, Michael Johnston

Inventors: Srinivas Bangalore, Michael Johnston

IPC classification: G06F3/033

CPC classification: G06F3/038, G06F3/04883

Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for multimodal interaction. The method includes receiving a plurality of multimodal inputs associated with a query, the plurality of multimodal inputs including at least one gesture input, editing the at least one gesture input with a gesture edit machine. The method further includes responding to the query based on the edited gesture input and remaining multimodal inputs. The gesture inputs can be from a stylus, finger, mouse, and other pointing/gesture device. The gesture input can be unexpected or errorful. The gesture edit machine can perform actions such as deletion, substitution, insertion, and aggregation. The gesture edit machine can be modeled as a finite-state transducer. In one aspect, the method further includes generating a lattice for each input, generating an integrated lattice of combined meaning of the generated lattices, and responding to the query further based on the integrated lattice.

9.

Granted patent
System and method for automatic identification of key phrases during a multimedia broadcast [Status: in force]

Publication No.: US08918803B2

Publication date: 2014-12-23

Application No.: US12823734

Filing date: 2010-06-25

Applicants: Srinivas Bangalore, Mazin E. Gilbert, Michael Johnston

Inventors: Srinivas Bangalore, Mazin E. Gilbert, Michael Johnston

IPC classification: H04H60/32, G06F17/30, H04N21/4402, H04N21/462, H04N21/8405, H04N21/235, H04N21/488, H04N7/088

CPC classification: H04N21/44222, G06F17/30743, G06F17/30787, G06F17/30864, G06F17/30964, H04N7/0884, H04N21/2355, H04N21/435, H04N21/4394, H04N21/44008, H04N21/440236, H04N21/4532, H04N21/4622, H04N21/47815, H04N21/4828, H04N21/4884, H04N21/8405

Abstract: An Internet Protocol television system includes a user profile agent, a keyword detection agent, and an information search agent. The user profile agent is in communication with a multimedia device, and generates a user profile based on information received from the multimedia device. The keyword detection agent is in communication with the user profile agent, and searches text associated with a multimedia video stream transmitted to the multimedia device for keywords associated with the user profile. The information search agent is in communication with the keyword detection agent, and connects to an information source associated with the keywords detected by the keyword detection agent, and provides additional information associated with the keywords to the multimedia device.
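
The keyword detection and information search steps in the abstract above could look roughly like this. The sketch assumes the text associated with the stream is available as a caption string, the user profile is already reduced to a list of keywords, and the information source is a simple in-memory dictionary. The function names and sample data are hypothetical.

```python
import re


def detect_keywords(caption_text, profile_keywords):
    """Return the profile keywords that occur as whole words or phrases in the captions."""
    found = []
    for keyword in profile_keywords:
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, caption_text, flags=re.IGNORECASE):
            found.append(keyword)
    return found


def fetch_additional_info(keywords, info_source):
    """Look up each detected keyword in an assumed keyword -> info mapping."""
    return {kw: info_source[kw] for kw in keywords if kw in info_source}


captions = "Tonight the Red Sox face the Yankees in Boston."
detected = detect_keywords(captions, ["red sox", "weather", "Boston"])
extra = fetch_additional_info(detected, {"red sox": "season schedule"})
```

Escaping each keyword with re.escape keeps punctuation in profile terms from being interpreted as regular-expression syntax.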

10.

Granted patent
System and method for enhancing voice-enabled search based on automated demographic identification [Status: in force]

Publication No.: US08401853B2

Publication date: 2013-03-19

Application No.: US12888012

Filing date: 2010-09-22

Applicants: Michael Johnston, Srinivas Bangalore, Junlan Feng, Taniya Mishra

Inventors: Michael Johnston, Srinivas Bangalore, Junlan Feng, Taniya Mishra

IPC classification: G10L15/04

CPC classification: G06F17/30026, G06F17/30976, G06F17/30979, G10L15/22, G10L2015/227

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating responses to a user speech query in voice-enabled search based on metadata that include demographic features of the speaker. A system practicing the method recognizes received speech from a speaker to generate recognized speech, identifies metadata about the speaker from the received speech, and feeds the recognized speech and the metadata to a question-answering engine. Identifying the metadata about the speaker is based on voice characteristics of the received speech. The demographic features can include age, gender, socio-economic group, nationality, and/or region. The metadata identified about the speaker from the received speech can be combined with or override self-reported speaker demographic information.
