CONVERSATIONAL COMPUTING VIA CONVERSATIONAL VIRTUAL MACHINE
    1.
    Invention application
    CONVERSATIONAL COMPUTING VIA CONVERSATIONAL VIRTUAL MACHINE (status: in force)

    Publication number: US20090313026A1

    Publication date: 2009-12-17

    Application number: US12544473

    Filing date: 2009-08-20

    IPC classification: G10L15/22

    Abstract: A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that “speak” conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.
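
The registration-and-dispatch idea in this abstract can be illustrated with a minimal sketch. It is not the patented design: the class name `ConversationalKernel`, the keyword-based matching, and the handler signature are all assumptions made for illustration, and speech recognition and synthesis are stubbed out so that plain text stands in for voice input and spoken output.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ConversationalKernel:
    """Toy dispatcher (hypothetical): routes a query to a registered application."""

    # Maps a capability keyword to the handler of the application that registered it.
    handlers: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def register(self, capabilities: List[str], handler: Callable[[str], str]) -> None:
        """An application announces the keywords it can serve."""
        for keyword in capabilities:
            self.handlers[keyword.lower()] = handler

    def handle(self, query: str) -> str:
        """Turn a (already transcribed) request into an application call."""
        for word in query.lower().split():
            handler = self.handlers.get(word)
            if handler is not None:
                return handler(query)
        return "Sorry, no registered application can handle that request."


kernel = ConversationalKernel()
kernel.register(["mail", "email"], lambda q: "You have no new mail.")
kernel.register(["calendar"], lambda q: "Your calendar is empty today.")
print(kernel.handle("check my mail"))
```

A real system would match requests against richer registered capabilities (grammars, vocabularies, dialog state) rather than single keywords; the keyword table is only the smallest structure that shows the kernel, not the application, owning the dialog.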

    Conversational computing via conversational virtual machine
    2.
    Granted patent
    Conversational computing via conversational virtual machine (status: in force)

    Publication number: US07729916B2

    Publication date: 2010-06-01

    Application number: US11551901

    Filing date: 2006-10-23

    IPC classification: G10L15/22 G10L15/28

    Abstract: A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that “speak” conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.

    Conversational computing via conversational virtual machine
    3.
    Granted patent
    Conversational computing via conversational virtual machine (status: lapsed)

    Publication number: US07137126B1

    Publication date: 2006-11-14

    Application number: US09806565

    Filing date: 1999-10-01

    Abstract: A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that “speak” conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.

    CONVERSATIONAL COMPUTING VIA CONVERSATIONAL VIRTUAL MACHINE
    5.
    Invention application
    CONVERSATIONAL COMPUTING VIA CONVERSATIONAL VIRTUAL MACHINE (status: in force)

    Publication number: US20070043574A1

    Publication date: 2007-02-22

    Application number: US11551901

    Filing date: 2006-10-23

    IPC classification: G10L21/00

    Abstract: A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that “speak” conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.

    Method and apparatus for suppressing background music or noise from the speech input of a speech recognizer
    6.
    Granted patent
    Method and apparatus for suppressing background music or noise from the speech input of a speech recognizer (status: lapsed)

    Publication number: US5848163A

    Publication date: 1998-12-08

    Application number: US594679

    Filing date: 1996-02-02

    CPC classification: G10L21/0208

    Abstract: A method and apparatus for removing the effect of background music or noise from speech input to a speech recognizer, so as to improve recognition accuracy, has been devised. Samples of pure music or noise related to the background music or noise that corrupts the speech input are used to reduce the effect of the background on speech recognition. The pure music and noise samples can be obtained in a variety of ways. The music- or noise-corrupted speech input is segmented into overlapping segments and is then processed in two phases: first, the best-matching pure music or noise segment is aligned with each speech segment; then a linear filter is built for each segment to remove the effect of background music or noise from the speech input, and the overlapping segments are averaged to improve the signal-to-noise ratio. The resulting acoustic output can then be fed to a speech recognizer.
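
The two-phase procedure in this abstract (align, then filter and overlap-average) can be sketched as follows. This is a simplified illustration, not the patented method: the per-segment "linear filter" is reduced to a single least-squares gain, alignment is an exhaustive search over offsets, and the function names, `seg_len`, and `hop` are assumptions.

```python
from typing import List, Sequence, Tuple


def best_alignment(segment: Sequence[float], reference: Sequence[float]) -> Tuple[int, float]:
    """Phase 1: find the reference window (offset) and gain that best explain the segment."""
    n = len(segment)
    best_offset, best_gain, best_residual = 0, 0.0, float("inf")
    for offset in range(len(reference) - n + 1):
        window = reference[offset:offset + n]
        energy = sum(r * r for r in window)
        # One-tap least-squares filter: gain minimizing sum((s - gain * r)^2).
        gain = sum(s * r for s, r in zip(segment, window)) / energy if energy else 0.0
        residual = sum((s - gain * r) ** 2 for s, r in zip(segment, window))
        if residual < best_residual:
            best_offset, best_gain, best_residual = offset, gain, residual
    return best_offset, best_gain


def suppress_background(noisy: Sequence[float], reference: Sequence[float],
                        seg_len: int = 8, hop: int = 4) -> List[float]:
    """Phase 2: subtract the filtered reference per segment, then average the overlaps."""
    total = [0.0] * len(noisy)
    count = [0] * len(noisy)
    for start in range(0, len(noisy) - seg_len + 1, hop):
        segment = noisy[start:start + seg_len]
        offset, gain = best_alignment(segment, reference)
        for i in range(seg_len):
            total[start + i] += segment[i] - gain * reference[offset + i]
            count[start + i] += 1
    # Samples not covered by any segment are passed through unchanged.
    return [t / c if c else x for t, c, x in zip(total, count, noisy)]
```

With a single gain, any speech component that happens to correlate with the reference window is attenuated as well; a multi-tap filter and a more careful alignment criterion would be needed in practice.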

    State-dependent speaker clustering for speaker adaptation
    7.
    Granted patent
    State-dependent speaker clustering for speaker adaptation (status: lapsed)

    Publication number: US5787394A

    Publication date: 1998-07-28

    Application number: US572223

    Filing date: 1995-12-13

    IPC classification: G10L15/06 G10L5/06

    CPC classification: G10L15/07 G10L2015/0631

    Abstract: A system and method for adaptation of a speaker-independent speech recognition system for use by a particular user. The system and method gather acoustic characterization data from a test speaker and compare the data with acoustic characterization data generated for a plurality of training speakers. A match score is computed between the test speaker's acoustic characterization for a particular acoustic subspace and each training speaker's acoustic characterization for the same acoustic subspace. The training speakers are ranked for the subspace according to their scores, and a new acoustic model is generated for the test speaker based upon the test speaker's acoustic characterization data and the acoustic characterization data of the closest matching training speakers. The process is repeated for each acoustic subspace.
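
The per-subspace ranking described in this abstract can be sketched in a few lines. The abstract does not say how the match score or the new model is computed, so this sketch assumes squared Euclidean distance between mean vectors as the score and a plain average as the "new acoustic model"; the function names and `top_k` are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def adapt_subspace(test_vec: Vector, training_vecs: Dict[str, Vector],
                   top_k: int = 2) -> Tuple[List[str], Vector]:
    """Rank training speakers for one acoustic subspace and pool the closest ones."""
    def distance(a: Vector, b: Vector) -> float:
        # Assumed match score: squared Euclidean distance (lower is a better match).
        return sum((x - y) ** 2 for x, y in zip(a, b))

    ranked = sorted(training_vecs, key=lambda name: distance(test_vec, training_vecs[name]))
    chosen = ranked[:top_k]
    pooled = [test_vec] + [training_vecs[name] for name in chosen]
    # Assumed model update: average the test speaker's data with the closest speakers' data.
    mean = [sum(v[d] for v in pooled) / len(pooled) for d in range(len(test_vec))]
    return chosen, mean


def adapt_speaker(test_data: Dict[str, Vector],
                  training_data: Dict[str, Dict[str, Vector]],
                  top_k: int = 2) -> Dict[str, Tuple[List[str], Vector]]:
    """Repeat the ranking independently for every acoustic subspace."""
    return {sub: adapt_subspace(vec, training_data[sub], top_k)
            for sub, vec in test_data.items()}
```

Because the ranking is redone per subspace, the set of "closest" training speakers can differ from one subspace to the next, which is the state-dependent aspect named in the title.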

    Speech coding via speech recognition and synthesis based on pre-enrolled phonetic tokens
    8.
    Granted patent
    Speech coding via speech recognition and synthesis based on pre-enrolled phonetic tokens (status: lapsed)

    Publication number: US6119086A

    Publication date: 2000-09-12

    Application number: US67863

    Filing date: 1998-04-28

    CPC classification: G10L19/0018

    Abstract: A speech coding system, responsive to an input speech signal provided by a system user, comprises: a speech coding portion including a speech recognition system responsive to the input speech signal and having a word vocabulary associated therewith, the speech recognition system recognizing the input speech signal in accordance with the vocabulary and generating phonetic tokens, such as at least one sequence of lefemes, representative of the input speech signal; a channel, responsive to the at least one sequence of lefemes, for transmitting and/or storing the at least one sequence of lefemes; and a speech synthesizing portion, responsive to the transmitted/stored sequence of lefemes, for generating a synthesized speech signal which is representative of the input speech signal provided by the system user using the at least one sequence of lefemes. The speech recognition system preferably generates acoustic parameters from the input speech signal which include voice characteristics of the system user. The speech coding system also preferably comprises a labeler which processes the input speech signal including words uttered by the system user which are not in the word vocabulary associated with the speech recognition system, the labeler generating phonetic tokens, such as at least one sequence of lefemes, optimally representative of the input speech signal. The sequence of lefemes from the labeler and the speech recognition portion are compared, for each speech segment, and the sequence most similar to the input speech is selected for transmission/storage. The speech synthesizing portion of the system preferably performs speech synthesis using pre-enrolled phonetic sub-units or tokens.
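
The per-segment choice between the recognizer's and the labeler's lefeme sequences can be sketched as below. The abstract does not define the similarity measure, so each candidate simply carries a precomputed, hypothetical `score` (higher means closer to the input speech); the `Candidate` type, the function name, and the token strings in the example are all illustrative assumptions.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    lefemes: List[str]   # phonetic tokens proposed for one speech segment
    score: float         # assumed similarity to the input speech (higher is closer)


def select_for_transmission(recognizer: List[Candidate],
                            labeler: List[Candidate]) -> List[List[str]]:
    """For each segment, keep whichever token sequence better matches the input."""
    selected = []
    for rec, lab in zip(recognizer, labeler):
        # Ties go to the recognizer, whose output is constrained by the vocabulary.
        selected.append(rec.lefemes if rec.score >= lab.score else lab.lefemes)
    return selected


recognizer_out = [Candidate(["HH", "EH", "L", "OW"], 0.92), Candidate(["W", "ER", "D"], 0.40)]
labeler_out = [Candidate(["HH", "AH", "L", "OW"], 0.85), Candidate(["Z", "AY", "G"], 0.77)]
print(select_for_transmission(recognizer_out, labeler_out))
```

In this example the second segment stands for an out-of-vocabulary word: the recognizer's forced in-vocabulary guess scores poorly, so the labeler's unconstrained token sequence is the one sent over the channel.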

    Speech coding apparatus with single-dimension acoustic prototypes for a speech recognizer
    9.
    Granted patent
    Speech coding apparatus with single-dimension acoustic prototypes for a speech recognizer (status: lapsed)

    Publication number: US5280562A

    Publication date: 1994-01-18

    Application number: US770495

    Filing date: 1991-10-03

    CPC classification: G10L19/038 H03M7/3082

    Abstract: In speech recognition and speech coding, the values of at least two features of an utterance are measured during a series of time intervals to produce a series of feature vector signals. A plurality of single-dimension prototype vector signals having only one parameter value are stored. At least two single-dimension prototype vector signals have parameter values representing first feature values, and at least two other single-dimension prototype vector signals have parameter values representing second feature values. A plurality of compound-dimension prototype vector signals have unique identification values and comprise one first-dimension and one second-dimension prototype vector signal. At least two compound-dimension prototype vector signals comprise the same first-dimension prototype vector signal. The feature values of each feature vector signal are compared to the parameter values of the compound-dimension prototype vector signals to obtain prototype match scores. The identification values of the compound-dimension prototype vector signals having the best prototype match scores for the feature vector signals are output as a sequence of coded representations of an utterance to be recognized. A match score, comprising an estimate of the closeness of a match between a speech unit and the sequence of coded representations of the utterance, is generated for each of a plurality of speech units. At least one speech subunit, of one or more best candidate speech units having the best match scores, is displayed.
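
The prototype structure in this abstract can be illustrated with a small two-feature sketch. The prototype names, parameter values, and compound table are invented for the example, and the match score is assumed to be squared distance (lower is better); the abstract itself does not specify the scoring function.

```python
from typing import Dict, List, Sequence, Tuple

# Single-dimension prototypes: each stores exactly one parameter value.
FIRST_DIM: Dict[str, float] = {"a1": 0.0, "a2": 1.0}    # values of the first feature
SECOND_DIM: Dict[str, float] = {"b1": 0.0, "b2": 2.0}   # values of the second feature

# Compound prototypes: unique id -> (first-dimension name, second-dimension name).
# Ids 1 and 2 share the same first-dimension prototype "a1".
COMPOUND: Dict[int, Tuple[str, str]] = {1: ("a1", "b1"), 2: ("a1", "b2"), 3: ("a2", "b2")}


def encode(feature_vectors: Sequence[Tuple[float, float]]) -> List[int]:
    """Label each feature vector with the id of its best-matching compound prototype."""
    labels = []
    for f1, f2 in feature_vectors:
        # Each single-dimension distance is computed once and reused by every
        # compound prototype that shares that single-dimension prototype.
        d1 = {name: (f1 - value) ** 2 for name, value in FIRST_DIM.items()}
        d2 = {name: (f2 - value) ** 2 for name, value in SECOND_DIM.items()}
        best_id = min(COMPOUND, key=lambda pid: d1[COMPOUND[pid][0]] + d2[COMPOUND[pid][1]])
        labels.append(best_id)
    return labels


print(encode([(0.1, 0.2), (0.2, 1.9), (0.9, 2.1)]))  # prints [1, 2, 3]
```

Sharing single-dimension prototypes means the per-dimension comparisons scale with the number of single-dimension prototypes rather than with the number of compound prototypes, which is the saving this sketch is meant to show.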
