Multi-sensory speech detection system
    1.
    Granted invention patent
    Multi-sensory speech detection system (status: lapsed)

    Publication number: US07383181B2

    Publication date: 2008-06-03

    Application number: US10629278

    Filing date: 2003-07-29

    IPC classification: G10L15/00

    Abstract: The present invention combines a conventional audio microphone with an additional speech sensor that provides a speech sensor signal based on an input. The speech sensor signal is generated based on an action undertaken by a speaker during speech, such as facial movement, bone vibration, throat vibration, throat impedance changes, etc. A speech detector component receives an input from the speech sensor and outputs a speech detection signal indicative of whether a user is speaking. The speech detector generates the speech detection signal based on the microphone signal and the speech sensor signal.
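    As a rough illustration of combining the two signals, the sketch below flags a frame as speech only when both the microphone and the speech sensor carry enough energy. The abstract does not say how the signals are combined: the AND rule, the energy measure, the frame length, the thresholds, and the names `frame_energies` and `detect_speech` are all assumptions made for this example, and both inputs are assumed to be sampled at the same rate.

```python
def frame_energies(signal, frame_len):
    """Mean squared amplitude of each complete, non-overlapping frame."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def detect_speech(mic, sensor, frame_len=160, mic_thresh=0.01, sensor_thresh=0.01):
    """Return one boolean per frame: True when both signals look active.

    Assumed fusion rule (not taken from the patent): a frame counts as speech
    only if the microphone energy AND the speech-sensor energy exceed their
    thresholds, so background noise picked up by the microphone alone is ignored.
    """
    mic_e = frame_energies(mic, frame_len)
    sensor_e = frame_energies(sensor, frame_len)
    return [m > mic_thresh and s > sensor_thresh for m, s in zip(mic_e, sensor_e)]
```

    With this rule, a frame in which only the microphone is active (for example, another person talking nearby) is not reported as the user speaking.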

    Method and apparatus for multi-sensory speech enhancement
    2.
    Granted invention patent
    Method and apparatus for multi-sensory speech enhancement (status: in force)

    Publication number: US07447630B2

    Publication date: 2008-11-04

    Application number: US10724008

    Filing date: 2003-11-26

    IPC classification: G10L21/02

    Abstract: A method and system use an alternative sensor signal received from a sensor other than an air conduction microphone to estimate a clean speech value. The estimation uses either the alternative sensor signal alone, or in conjunction with the air conduction microphone signal. The clean speech value is estimated without using a model trained from noisy training data collected from an air conduction microphone. Under one embodiment, correction vectors are added to a vector formed from the alternative sensor signal in order to form a filter, which is applied to the air conduction microphone signal to produce the clean speech estimate. In other embodiments, the pitch of a speech signal is determined from the alternative sensor signal and is used to decompose an air conduction microphone signal. The decomposed signal is then used to determine a clean signal estimate.
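    The correction-vector embodiment can be pictured with a very small sketch. The abstract does not specify the feature domain or how the filter is applied, so the code below assumes per-frequency-bin magnitude values and an element-wise gain; `estimate_clean_speech` and its parameter names are hypothetical.

```python
def estimate_clean_speech(mic_spectrum, alt_vector, correction):
    """Toy version of the correction-vector idea.

    Assumptions (not from the patent): all three inputs are equal-length lists
    holding one value per frequency bin, and the filter acts as a per-bin gain.
    """
    # Form the filter by adding the correction vector to the alternative-sensor vector.
    filt = [a + c for a, c in zip(alt_vector, correction)]
    # Apply the filter to the air conduction microphone spectrum, bin by bin.
    return [h * y for h, y in zip(filt, mic_spectrum)]
```

    In a real system the correction vectors would be learned from training data; here they are simply passed in by the caller.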

    Method and system of runtime acoustic unit selection for speech synthesis
    3.
    Granted invention patent
    Method and system of runtime acoustic unit selection for speech synthesis (status: lapsed)

    Publication number: US5913193A

    Publication date: 1999-06-15

    Application number: US648808

    Filing date: 1996-04-30

    CPC classification: G10L13/07

    Abstract: The present invention pertains to a concatenative speech synthesis system and method which produces more natural-sounding speech. The system provides for multiple instances of each acoustic unit which can be used to generate a speech waveform representing a linguistic expression. The multiple instances are formed during an analysis or training phase of the synthesis process and are limited to a robust representation of the highest probability instances. The provision of multiple instances enables the synthesizer to select the instance which closely resembles the desired instance, thereby eliminating the need to alter the stored instance to match the desired instance. This in essence minimizes the spectral distortion between the boundaries of adjacent instances, thereby producing more natural-sounding speech.
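    One way to picture selecting among multiple stored instances so that boundary distortion stays small is a dynamic-programming search over the candidates. This is a sketch under assumptions, not the patented procedure: each instance is reduced to a hypothetical pair of spectral vectors (at its start and end boundary), distortion is taken to be Euclidean distance, and `select_instances` is an illustrative name.

```python
import math


def select_instances(candidates):
    """Pick one instance per acoustic unit, minimizing total join distortion.

    candidates[i] is a list of (start_vec, end_vec) pairs for unit i, where each
    vector is a list of floats describing the spectrum at that boundary.
    Returns a list with the chosen instance index for every unit.
    """
    # cost[j]: lowest total distortion of any path ending at instance j of the current unit.
    cost = [0.0] * len(candidates[0])
    back = []  # back[t][j]: best predecessor of instance j of unit t + 1
    for prev, cur in zip(candidates, candidates[1:]):
        new_cost, pointers = [], []
        for start, _end in cur:
            # Distortion of joining each previous instance's end to this instance's start.
            joins = [cost[k] + math.dist(prev[k][1], start) for k in range(len(prev))]
            best = min(range(len(joins)), key=joins.__getitem__)
            new_cost.append(joins[best])
            pointers.append(best)
        cost = new_cost
        back.append(pointers)
    # Trace the cheapest path back from the last unit to the first.
    path = [min(range(len(cost)), key=cost.__getitem__)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

    Because well-matching instances are chosen rather than modified, no signal processing is applied to the stored units in this sketch.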

    Text-to-speech using clustered context-dependent phoneme-based units
    4.
    Granted invention patent
    Text-to-speech using clustered context-dependent phoneme-based units (status: lapsed)

    Publication number: US6163769A

    Publication date: 2000-12-19

    Application number: US949138

    Filing date: 1997-10-02

    IPC classification: G10L13/06 G10L13/00

    CPC classification: G10L13/07

    Abstract: A text-to-speech system includes a storage device for storing a clustered set of context-dependent phoneme-based units of a target speaker. In one embodiment, decision trees are used wherein each decision-tree-based context-dependent phoneme-based unit is arranged based on the context of at least one immediately preceding and succeeding phoneme. At least one of the context-dependent phoneme-based units represents other non-stored context-dependent phoneme units of similar sound due to similar contexts. A text analyzer obtains a string of phonetic symbols representative of text to be converted to speech. A concatenation module selects stored decision-tree-based context-dependent phoneme-based units from the set of decision-tree-based context-dependent phoneme-based units based on the context of the phonetic symbols and synthesizes the selected phoneme-based units to generate speech corresponding to the text.
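    The decision-tree lookup described in the abstract can be sketched as follows. The tree layout, the questions about nasal neighbours, and the unit names are invented for illustration; only the idea that a stored leaf unit stands in for many unseen contexts comes from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    """Either a leaf holding a stored unit id, or a yes/no question about context."""
    unit: Optional[str] = None
    question: Optional[Callable[[str, str], bool]] = None  # takes (left, right) phonemes
    yes: Optional["Node"] = None
    no: Optional["Node"] = None


def lookup(node, left, right):
    """Walk the tree for one phoneme and return the clustered unit for this context."""
    while node.unit is None:
        node = node.yes if node.question(left, right) else node.no
    return node.unit


# Hypothetical tree for the phoneme "ae": three stored units cover every context.
NASALS = {"m", "n", "ng"}
TREE_AE = Node(
    question=lambda left, right: right in NASALS,
    yes=Node(unit="ae_before_nasal"),
    no=Node(
        question=lambda left, right: left in NASALS,
        yes=Node(unit="ae_after_nasal"),
        no=Node(unit="ae_default"),
    ),
)
```

    A context never seen during training, such as "ae" between "z" and "ng", still resolves to a stored unit (`ae_before_nasal` here) because the questions depend only on phoneme classes.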

    FORCE-FEEDBACK WITHIN TELEPRESENCE
    6.
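    Invention patent application
    FORCE-FEEDBACK WITHIN TELEPRESENCE (status: in force)

    Publication number: US20100306647A1

    Publication date: 2010-12-02

    Application number: US12472579

    Filing date: 2009-05-27

    IPC classification: G06F3/01 G06F3/048

    CPC classification: G06F3/016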

    Abstract: The claimed subject matter provides a system and/or a method that facilitates replicating a telepresence session with a real world physical meeting. A telepresence session can be initiated within a communication framework that includes two or more virtually represented users that communicate therein. A trigger component can monitor the telepresence session in real time to identify a participant interaction with an object, wherein the object is at least one of a real world physical object or a virtually represented object within the telepresence session. A feedback component can implement a force feedback to at least one participant within the telepresence session based upon the identified participant interaction with the object, wherein the force feedback is employed via a device associated with at least one participant.

    Use of a unified language model
    8.
    Granted invention patent
    Use of a unified language model (status: lapsed)

    Publication number: US07013265B2

    Publication date: 2006-03-14

    Application number: US11003121

    Filing date: 2004-12-03

    IPC classification: G06F17/27 G10L15/18 G10L11/00

    CPC classification: G10L15/193 G10L15/197

    Abstract: A language processing system includes a unified language model. The unified language model comprises a plurality of context-free grammars having non-terminal tokens representing semantic or syntactic concepts and terminals, and an N-gram language model having non-terminal tokens. A language processing module capable of receiving an input signal indicative of language accesses the unified language model to recognize the language. The language processing module generates hypotheses for the received language as a function of words of the unified language model and/or provides an output signal indicative of the language and at least some of the semantic or syntactic concepts contained therein.
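    To make the combination of an N-gram model and context-free grammars concrete, the toy below scores a hypothesis in which some words are covered by a non-terminal token. Everything here is an assumption for illustration: the bigram table, the single `<CITY>` grammar with flat expansions, the probabilities, and the `score` function. The abstract only states that both model types share non-terminal tokens.

```python
import math

# Hypothetical bigram probabilities over words and the non-terminal <CITY>.
BIGRAM = {
    ("<s>", "fly"): 1.0,
    ("fly", "to"): 1.0,
    ("to", "<CITY>"): 0.5,
    ("to", "boston"): 0.1,
}

# Hypothetical context-free rules: each non-terminal expands to a word tuple
# with a probability (a flat probabilistic grammar, to keep the sketch short).
CFG = {"<CITY>": {("boston",): 0.6, ("new", "york"): 0.4}}


def score(tokens):
    """Log-probability of a hypothesis.

    tokens is a list whose items are either a plain word (str) or a pair
    (non_terminal, words) stating which words the non-terminal covers.
    """
    logp, prev = 0.0, "<s>"
    for tok in tokens:
        if isinstance(tok, tuple):
            symbol, words = tok
            # The N-gram predicts the non-terminal; the grammar predicts its words.
            logp += math.log(BIGRAM[(prev, symbol)]) + math.log(CFG[symbol][words])
            prev = symbol
        else:
            logp += math.log(BIGRAM[(prev, tok)])
            prev = tok
    return logp
```

    Scoring `["fly", "to", ("<CITY>", ("new", "york"))]` also tells the caller that "new york" fills the city concept, which is the kind of semantic output the abstract mentions.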

    Information retrieval and speech recognition based on language models
    9.
    Granted invention patent
    Information retrieval and speech recognition based on language models (status: lapsed)

    Publication number: US06418431B1

    Publication date: 2002-07-09

    Application number: US09050286

    Filing date: 1998-03-30

    IPC classification: G06F17/30

    Abstract: A language model is used in a speech recognition system which has access to a first, smaller data store and a second, larger data store. The language model is adapted by formulating an information retrieval query based on information contained in the first data store and querying the second data store. Information retrieved from the second data store is used in adapting the language model. Also, language models are used in retrieving information from the second data store. Language models are built based on information in the first data store, and based on information in the second data store. The perplexity of a document in the second data store is determined, given the first language model, and given the second language model. Relevancy of the document is determined based upon the first and second perplexities. Documents are retrieved which have a relevancy measure that exceeds a threshold level.
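    The perplexity-based relevancy test can be sketched with two small unigram models. The abstract does not give the relevancy formula, so the ratio used below (perplexity under the large-store model divided by perplexity under the small-store model) is an assumption, as are the add-one smoothing, the default threshold, and the function names.

```python
import math
from collections import Counter


def train_unigram(texts, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary."""
    counts = Counter(w for t in texts for w in t.split() if w in vocab)
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def perplexity(doc, model):
    """Perplexity of a whitespace-tokenized document; every word must be in the model."""
    words = doc.split()
    log_sum = sum(math.log(model[w]) for w in words)
    return math.exp(-log_sum / len(words))


def retrieve(docs, small_lm, large_lm, threshold=1.0):
    """Keep documents that the small-store model predicts better than the large-store one.

    Assumed relevancy measure: perplexity(large) / perplexity(small).
    """
    return [
        d for d in docs
        if perplexity(d, large_lm) / perplexity(d, small_lm) > threshold
    ]
```

    A document that resembles the user's own (small) data store gets a low perplexity under that model and therefore a high ratio, so it passes the threshold.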

    Extensible speech recognition system that provides a user with audio feedback
    10.
    Granted invention patent
    Extensible speech recognition system that provides a user with audio feedback (status: lapsed)

    Publication number: US5933804A

    Publication date: 1999-08-03

    Application number: US833916

    Filing date: 1997-04-10

    CPC classification: G10L15/063 G10L2015/0638

    Abstract: A speech recognition system is extensible in that new terms may be added to a list of terms that are recognized by the speech recognition system. The speech recognition system provides audio feedback when new terms are added so that a user may hear how the system expects the word to be pronounced. The user may then accept the pronunciation or provide his own pronunciation. The user may also selectively change the pronunciation of words to avoid misrecognitions by the system. The system may provide appropriate user interface elements for enabling a user to change the pronunciation of words. The system may also include intelligence for automatically changing the pronunciation of words used in recognition based upon empirically derived information.
