Patent search: ap:("Ponani Gopalakrishnan" OR "David Nahamoo" OR "Mukund Panmanabhan" OR "Lazaros Polymenakos") AND inv:"Ponani Gopalakrishnan" (page 1)

1.

Granted invention patent
Method and apparatus for suppressing background music or noise from the speech input of a speech recognizer (lapsed)

Publication No.: US5848163A

Publication date: 1998-12-08

Application No.: US594679

Filing date: 1996-02-02

Applicant(s): Ponani Gopalakrishnan, David Nahamoo, Mukund Panmanabhan, Lazaros Polymenakos

Inventor(s): Ponani Gopalakrishnan, David Nahamoo, Mukund Panmanabhan, Lazaros Polymenakos

IPC classification: G10K11/178, G10L21/02, H04R29/00

CPC classification: G10L21/0208

Abstract: A method and apparatus for removing the effect of background music or noise from speech input to a speech recognizer so as to improve recognition accuracy has been devised. Samples of pure music or noise related to the background music or noise that corrupts the speech input are used to reduce the effect of the background on speech recognition. The pure music and noise samples can be obtained in a variety of ways. The music- or noise-corrupted speech input is divided into overlapping segments and is then processed in two phases: first, the best-matching pure music or noise segment is aligned with each speech segment; then a linear filter is built for each segment to remove the effect of background music or noise from the speech input, and the overlapping segments are averaged to improve the signal-to-noise ratio. The resulting acoustic output can then be fed to a speech recognizer.
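The two-phase processing described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the "linear filter" is reduced here to a single least-squares gain per segment, alignment is a brute-force search over offsets, and all names (`fit_score`, `best_offset`, `suppress_background`, `seg_len`, `hop`) are hypothetical.

```python
def fit_score(segment, window):
    """Energy of `segment` explained by a one-tap least-squares fit to `window`."""
    energy = sum(w * w for w in window)
    if energy == 0:
        return 0.0
    dot = sum(s * w for s, w in zip(segment, window))
    return dot * dot / energy


def best_offset(segment, reference):
    """Phase 1: offset into `reference` whose window best matches `segment`."""
    n = len(segment)
    offsets = range(len(reference) - n + 1)
    return max(offsets, key=lambda k: fit_score(segment, reference[k:k + n]))


def suppress_background(noisy, reference, seg_len=8, hop=4):
    """Subtract the best-aligned, gain-scaled background from overlapping segments."""
    if len(reference) < seg_len:
        raise ValueError("reference must be at least seg_len samples long")
    total = [0.0] * len(noisy)
    count = [0] * len(noisy)
    for start in range(0, len(noisy) - seg_len + 1, hop):
        segment = noisy[start:start + seg_len]
        k = best_offset(segment, reference)
        window = reference[k:k + seg_len]
        # Phase 2: one-tap "filter" (assumption): least-squares gain for this segment.
        energy = sum(w * w for w in window)
        gain = sum(s * w for s, w in zip(segment, window)) / energy if energy else 0.0
        for i in range(seg_len):
            total[start + i] += segment[i] - gain * window[i]
            count[start + i] += 1
    # Average the overlapping estimates; samples no segment covers are passed through.
    return [t / c if c else x for t, c, x in zip(total, count, noisy)]


# Toy input: the "noisy" signal is only background (a delayed, attenuated copy).
reference = [((i * 7) % 11) - 5 for i in range(40)]
noisy = [0.5 * r for r in reference[3:19]]
cleaned = suppress_background(noisy, reference)
```

With a speech-free input like this one, the residual after subtraction should be close to zero; a practical system would use a multi-tap filter and a faster alignment search.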

2.

Granted invention patent
Transcription of speech data with segments from acoustically dissimilar environments (lapsed)

Publication No.: US6067517A

Publication date: 2000-05-23

Application No.: US595722

Filing date: 1996-02-02

Applicant(s): Lalit Rai Bahl, Ponani Gopalakrishnan, Ramesh Ambat Gopinath, Stephane Herman Maes, Mukund Panmanabhan, Lazaros Polymenakos

Inventor(s): Lalit Rai Bahl, Ponani Gopalakrishnan, Ramesh Ambat Gopinath, Stephane Herman Maes, Mukund Panmanabhan, Lazaros Polymenakos

IPC classification: G10L15/20

CPC classification: G10L15/20

Abstract: A technique to improve the recognition accuracy when transcribing speech data that contains data from a wide range of environments. Input data in many situations contains data from a variety of sources in different environments. The classes of data include: clean speech, speech corrupted by noise (e.g., music), non-speech (e.g., pure music with no speech), telephone speech, and the identity of a speaker. A technique is described whereby the different classes of data are first automatically identified, and then each class is transcribed by a system that is made specifically for it. The invention also describes a segmentation algorithm that is based on making up an acoustic model that characterizes the data in each class, and then using a dynamic programming algorithm (the Viterbi algorithm) to automatically identify segments that belong to each class. The acoustic models are made in a certain feature space, and the invention also describes different feature spaces for use with different classes.
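The Viterbi-based segmentation mentioned in this abstract can be sketched as follows. This is only an illustration under strong assumptions: each frame is a single number, each class is modeled by one Gaussian with made-up parameters, and a fixed `switch_penalty` stands in for transition probabilities. None of these names or values come from the patent.

```python
import math


def gaussian_loglik(x, mean, var):
    """Log-density of x under a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi_labels(frames, models, switch_penalty=5.0):
    """Most likely class label per frame, penalizing every class change."""
    if not frames:
        return []
    classes = list(models)
    # score[c]: best log-score of any label path that ends in class c.
    score = {c: gaussian_loglik(frames[0], *models[c]) for c in classes}
    backpointers = []
    for x in frames[1:]:
        new_score, pointer = {}, {}
        for c in classes:
            def arrive(p, c=c):
                return score[p] - (0.0 if p == c else switch_penalty)
            prev = max(classes, key=arrive)
            new_score[c] = arrive(prev) + gaussian_loglik(x, *models[c])
            pointer[c] = prev
        score = new_score
        backpointers.append(pointer)
    # Trace back from the best final class.
    path = [max(classes, key=score.get)]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return path


def to_segments(labels):
    """Collapse per-frame labels into (label, start, end) runs, end exclusive."""
    segments, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments


# Hypothetical class models: (mean, variance) of a 1-D feature.
models = {"clean_speech": (0.0, 1.0), "music": (5.0, 1.0)}
frames = [0.1, -0.2, 0.3, 5.1, 4.8, 5.2, 0.0, 0.2]
segments = to_segments(viterbi_labels(frames, models))
```

Each resulting segment could then be routed to a recognizer built for its class, which is the second step the abstract describes.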

3.

Granted invention patent
Conversational computing via conversational virtual machine (in force)

Publication No.: US07729916B2

Publication date: 2010-06-01

Application No.: US11551901

Filing date: 2006-10-23

Applicant(s): Daniel Coffman, Liam D. Comerford, Steven DeGennaro, Edward A. Epstein, Ponani Gopalakrishnan, Stephane H. Maes, David Nahamoo

Inventor(s): Daniel Coffman, Liam D. Comerford, Steven DeGennaro, Edward A. Epstein, Ponani Gopalakrishnan, Stephane H. Maes, David Nahamoo

IPC classification: G10L15/22, G10L15/28

CPC classification: H04M3/50, G06F17/30899, G10L15/22, G10L15/285, G10L2015/228, H04L67/02, H04M1/72561, H04M3/42204, H04M3/44, H04M3/493, H04M3/4931, H04M3/4936, H04M3/4938, H04M7/00, H04M2201/40, H04M2201/60, H04M2203/355, H04M2250/74

Abstract: A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that “speak” conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.

4.

Invention application
CONVERSATIONAL COMPUTING VIA CONVERSATIONAL VIRTUAL MACHINE (in force)

Publication No.: US20070043574A1

Publication date: 2007-02-22

Application No.: US11551901

Filing date: 2006-10-23

Applicant(s): Daniel Coffman, Liam Comerford, Steven DeGennaro, Edward Epstein, Ponani Gopalakrishnan, Stephan Maes, David Nahamoo

Inventor(s): Daniel Coffman, Liam Comerford, Steven DeGennaro, Edward Epstein, Ponani Gopalakrishnan, Stephan Maes, David Nahamoo

IPC classification: G10L21/00

CPC classification: H04M3/50, G06F17/30899, G10L15/22, G10L15/285, G10L2015/228, H04L67/02, H04M1/72561, H04M3/42204, H04M3/44, H04M3/493, H04M3/4931, H04M3/4936, H04M3/4938, H04M7/00, H04M2201/40, H04M2201/60, H04M2203/355, H04M2250/74

Abstract: A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that “speak” conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.

5.

Granted invention patent
Conversational computing via conversational virtual machine (lapsed)

Publication No.: US07137126B1

Publication date: 2006-11-14

Application No.: US09806565

Filing date: 1999-10-01

Applicant(s): Daniel Coffman, Liam D. Comerford, Steven DeGennaro, Edward A. Epstein, Ponani Gopalakrishnan, Stephane H. Maes, David Nahamoo

Inventor(s): Daniel Coffman, Liam D. Comerford, Steven DeGennaro, Edward A. Epstein, Ponani Gopalakrishnan, Stephane H. Maes, David Nahamoo

IPC classification: G06F9/54, G06F9/50, G06F9/44, G10L15/00

CPC classification: H04M3/50, G06F17/30899, G10L15/22, G10L15/285, G10L2015/228, H04L67/02, H04M1/72561, H04M3/42204, H04M3/44, H04M3/493, H04M3/4931, H04M3/4936, H04M3/4938, H04M7/00, H04M2201/40, H04M2201/60, H04M2203/355, H04M2250/74

Abstract: A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that “speak” conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.

6.

Granted invention patent
Conversational computing via conversational virtual machine (in force)

Publication No.: US08082153B2

Publication date: 2011-12-20

Application No.: US12544473

Filing date: 2009-08-20

Applicant(s): Daniel Coffman, Liam D. Comerford, Steven DeGennaro, Edward A. Epstein, Ponani Gopalakrishnan, Stephane H. Maes, David Nahamoo

Inventor(s): Daniel Coffman, Liam D. Comerford, Steven DeGennaro, Edward A. Epstein, Ponani Gopalakrishnan, Stephane H. Maes, David Nahamoo

IPC classification: G10L15/28, G06F3/16

CPC classification: H04M3/50, G06F17/30899, G10L15/22, G10L15/285, G10L2015/228, H04L67/02, H04M1/72561, H04M3/42204, H04M3/44, H04M3/493, H04M3/4931, H04M3/4936, H04M3/4938, H04M7/00, H04M2201/40, H04M2201/60, H04M2203/355, H04M2250/74

Abstract: A method for conversational computing includes executing code embodying a conversational virtual machine, registering a plurality of input/output resources with a conversational kernel, providing an interface between a plurality of active applications and the conversational kernel processing input/output data, receiving input queries and input events of a multi-modal dialog across a plurality of user interface modalities of the plurality of active applications, generating output messages and output events of the multi-modal dialog in connection with the plurality of active applications, and managing, by the conversational kernel, a context stack associated with the plurality of active applications and the multi-modal dialog to transform the input queries into application calls for the plurality of active applications and convert the output messages into speech, wherein the context stack accumulates a context of each of the plurality of active applications.
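As a rough illustration of how a kernel might use a context stack to turn queries into application calls, consider the toy sketch below. It is not the architecture claimed in the patent: the class `ConversationalKernel`, the keyword-matching rule, and the sample applications are all hypothetical, and the returned string merely stands in for a message that would be converted to speech.

```python
class ConversationalKernel:
    """Toy kernel: routes text queries to registered applications."""

    def __init__(self):
        self.apps = {}            # app name -> {command word: handler}
        self.context_stack = []   # app names, most recently active last

    def register(self, name, commands):
        """Register an application and the command words it understands."""
        self.apps[name] = commands

    def handle(self, query):
        """Turn a query into an application call and return the reply text."""
        words = query.lower().split()
        # Prefer the most recently active application, then the rest.
        recent = list(reversed(self.context_stack))
        order = recent + [a for a in self.apps if a not in self.context_stack]
        for app in order:
            for command, handler in self.apps[app].items():
                if command in words:
                    # Move this application to the top of the context stack.
                    if app in self.context_stack:
                        self.context_stack.remove(app)
                    self.context_stack.append(app)
                    return f"{app}: {handler(query)}"
        return "no application understood the request"


kernel = ConversationalKernel()
kernel.register("mail", {"read": lambda q: "reading message",
                         "next": lambda q: "next message"})
kernel.register("music", {"play": lambda q: "playing",
                          "next": lambda q: "next track"})
replies = [kernel.handle(q) for q in
           ["play some jazz", "next", "read my mail", "next"]]
```

The ambiguous query "next" is resolved differently depending on which application was active most recently, which is the kind of disambiguation a context stack makes possible.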

7.

Invention application
CONVERSATIONAL COMPUTING VIA CONVERSATIONAL VIRTUAL MACHINE (in force)

Publication No.: US20090313026A1

Publication date: 2009-12-17

Application No.: US12544473

Filing date: 2009-08-20

Applicant(s): Daniel Coffman, Liam D. Comeford, Steven DeGennaro, Edward A. Epstein, Ponani Gopalakrishnan, Stephane H. Maes, David Nahamoo

Inventor(s): Daniel Coffman, Liam D. Comeford, Steven DeGennaro, Edward A. Epstein, Ponani Gopalakrishnan, Stephane H. Maes, David Nahamoo

IPC classification: G10L15/22

CPC classification: H04M3/50, G06F17/30899, G10L15/22, G10L15/285, G10L2015/228, H04L67/02, H04M1/72561, H04M3/42204, H04M3/44, H04M3/493, H04M3/4931, H04M3/4936, H04M3/4938, H04M7/00, H04M2201/40, H04M2201/60, H04M2203/355, H04M2250/74

Abstract: A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that “speak” conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.

8.

Granted invention patent
State-dependent speaker clustering for speaker adaptation (lapsed)

Publication No.: US5787394A

Publication date: 1998-07-28

Application No.: US572223

Filing date: 1995-12-13

Applicant(s): Lalit Rai Bahl, Ponani Gopalakrishnan, David Nahamoo, Mukund Padmanabhan

Inventor(s): Lalit Rai Bahl, Ponani Gopalakrishnan, David Nahamoo, Mukund Padmanabhan

IPC classification: G10L15/06, G10L5/06

CPC classification: G10L15/07, G10L2015/0631

Abstract: A system and method for adaptation of a speaker-independent speech recognition system for use by a particular user. The system and method gather acoustic characterization data from a test speaker and compare the data with acoustic characterization data generated for a plurality of training speakers. A match score is computed between the test speaker's acoustic characterization for a particular acoustic subspace and each training speaker's acoustic characterization for the same acoustic subspace. The training speakers are ranked for the subspace according to their scores, and a new acoustic model is generated for the test speaker based upon the test speaker's acoustic characterization data and the acoustic characterization data of the closest matching training speakers. The process is repeated for each acoustic subspace.
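The per-subspace ranking described in this abstract can be sketched as below. The sketch makes assumptions the abstract does not state: each speaker's characterization for a subspace is a short mean vector, the match score is squared Euclidean distance (lower is closer), and the new model is a plain average over the test speaker and the `k` closest training speakers. The function names and the sample data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def adapt_per_subspace(test, training, k=2):
    """For each subspace, rank training speakers and pool the k closest.

    test:     {subspace: vector}
    training: {speaker: {subspace: vector}}
    Returns   {subspace: (adapted_vector, closest_speakers)}.
    """
    adapted = {}
    for subspace, test_vec in test.items():
        # Rank training speakers by their match score in this subspace only.
        ranked = sorted(
            training,
            key=lambda name: squared_distance(test_vec, training[name][subspace]),
        )
        closest = ranked[:k]
        pool = [test_vec] + [training[name][subspace] for name in closest]
        # Assumed combination rule: component-wise mean of the pooled vectors.
        mean = [sum(column) / len(pool) for column in zip(*pool)]
        adapted[subspace] = (mean, closest)
    return adapted


test_speaker = {"vowels": [1.0, 1.0], "fricatives": [0.0, 4.0]}
training_speakers = {
    "A": {"vowels": [1.0, 2.0], "fricatives": [5.0, 5.0]},
    "B": {"vowels": [4.0, 4.0], "fricatives": [0.0, 5.0]},
    "C": {"vowels": [1.0, 0.0], "fricatives": [1.0, 4.0]},
}
adapted = adapt_per_subspace(test_speaker, training_speakers)
```

In this toy data the closest speakers differ between the two subspaces, which is the point of repeating the ranking for each acoustic subspace rather than choosing one global set of similar speakers.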

9.

Invention application
RESOURCE CONFIGURATION IN MULTI-MODAL DISTRIBUTED COMPUTING SYSTEMS (in force)

Publication No.: US20090094451A1

Publication date: 2009-04-09

Application No.: US12272597

Filing date: 2008-11-17

Applicant(s): Ponani Gopalakrishnan, Stephane H. Maes, Ganesh N. Ramaswamy

Inventor(s): Ponani Gopalakrishnan, Stephane H. Maes, Ganesh N. Ramaswamy

IPC classification: G06F1/24

CPC classification: H04W4/02, H04L67/18, H04W4/20, H04W8/18

Abstract: A method and system for configuring available resources in real time to automatically accommodate the needs of the system user in a multi-modal distributed computing system is disclosed. Information about the location or environment of a wireless device is used, preferably in combination with user personal preferences and past history, to modify the behavior of the wireless device, including the selection of the most appropriate mode of interaction with the device and the activation of applications thereon as appropriate.
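One way to picture the mode selection described in this abstract is a small rule-based chooser. The sketch below is purely illustrative: the environment labels, the mode names, the priority order (explicit preference, then past history, then a default table) and the function `choose_mode` are all assumptions, not details taken from the patent.

```python
from collections import Counter

# Hypothetical defaults: environment label -> interaction mode.
DEFAULT_MODE = {"car": "speech", "meeting": "text", "street": "speech", "home": "text"}


def choose_mode(environment, preferences=None, history=None):
    """Pick an interaction mode for the device in the given environment.

    preferences: {environment: mode} set explicitly by the user.
    history:     list of (environment, mode) pairs from past sessions.
    """
    preferences = preferences or {}
    history = history or []
    # 1. An explicit user preference wins.
    if environment in preferences:
        return preferences[environment]
    # 2. Otherwise use the mode chosen most often in this environment.
    past = Counter(mode for env, mode in history if env == environment)
    if past:
        return past.most_common(1)[0][0]
    # 3. Otherwise fall back to the default table, then to text.
    return DEFAULT_MODE.get(environment, "text")


past_sessions = [("home", "speech"), ("home", "speech"), ("home", "text")]
mode_in_car = choose_mode("car")
mode_at_home = choose_mode("home", history=past_sessions)
mode_in_meeting = choose_mode("meeting", preferences={"meeting": "silent_text"})
```

A real system would obtain the environment from location or sensor data and could also start or stop applications, which this sketch leaves out.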

10.

Granted invention patent
Resource configuration in multi-modal distributed computing systems (in force)

Publication No.: US07454608B2

Publication date: 2008-11-18

Application No.: US10698101

Filing date: 2003-10-31

Applicant(s): Ponani Gopalakrishnan, Stephane H. Maes, Ganesh N. Ramaswamy

Inventor(s): Ponani Gopalakrishnan, Stephane H. Maes, Ganesh N. Ramaswamy

IPC classification: G06F15/177, G10L15/00

CPC classification: H04W4/02, H04L67/18, H04W4/20, H04W8/18

Abstract: A method and system for configuring available resources in real time to automatically accommodate the needs of the system user in a multi-modal distributed computing system is disclosed. Information about the location or environment of a wireless device is used, preferably in combination with user personal preferences and past history, to modify the behavior of the wireless device, including the selection of the most appropriate mode of interaction with the device and the activation of applications thereon as appropriate.
