Speech dialog method and system
    1.
    Patent application
    Status: In force

    Publication No.: US20060247921A1

    Publication date: 2006-11-02

    Application No.: US11118670

    Filing date: 2005-04-29

    IPC classification: G10L11/04

    Abstract: An electronic device (300) for speech dialog includes functions that receive (305, 105) a speech phrase that comprises a request phrase that includes an instantiated variable (215), generate (335, 115) pitch and voicing characteristics (315) of the instantiated variable, and perform voice recognition (319, 125) of the instantiated variable to determine a most likely set of acoustic states (235). The electronic device may generate (335, 140) a synthesized value of the instantiated variable using the most likely set of acoustic states and the pitch and voicing characteristics of the instantiated variable. To disambiguate (425, 430) a newly received instantiated variable, the electronic device may use a table of previously entered variable values that have been determined to be unique, in which each value is associated with the most likely set of acoustic states and the pitch and voicing characteristics determined when that value was received.
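
One way to picture the table-based disambiguation in the abstract is the sketch below. It is only an illustration under stated assumptions: the class name `ValueTable`, the use of `difflib.SequenceMatcher` to compare acoustic-state sequences, and the 0.8 similarity threshold are all hypothetical, and the abstract does not say how the comparison is made. Pitch and voicing data are simply stored with each entry, as they would be needed for later synthesis.

```python
from difflib import SequenceMatcher


class ValueTable:
    """Hypothetical table of unique, previously entered variable values."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed similarity cut-off, not from the patent
        self.entries = []           # each entry: (acoustic_states, pitch_and_voicing)

    def disambiguate(self, states, pitch):
        """Return (index, is_new) for a newly received instantiated variable.

        The state sequence is compared with every stored entry. A close enough
        match is treated as the same value; otherwise the input is stored as a
        new unique value together with its pitch and voicing data.
        """
        best_index, best_score = None, 0.0
        for i, (stored_states, _) in enumerate(self.entries):
            score = SequenceMatcher(None, stored_states, states).ratio()
            if score > best_score:
                best_index, best_score = i, score
        if best_index is not None and best_score >= self.threshold:
            return best_index, False
        self.entries.append((list(states), list(pitch)))
        return len(self.entries) - 1, True
```

A caller would pass the recognizer's most likely state sequence for each spoken value; a repeated value then resolves to its existing table entry, whose stored states and pitch data could drive synthesis of a confirmation prompt.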

    Speech dialog method and system
    2.
    Granted patent
    Status: In force

    Publication No.: US07181397B2

    Publication date: 2007-02-20

    Application No.: US11118670

    Filing date: 2005-04-29

    IPC classification: G10L15/14

    Abstract: An electronic device (300) for speech dialog includes functions that receive (305, 105) a speech phrase that comprises a request phrase that includes an instantiated variable (215), generate (335, 115) pitch and voicing characteristics (315) of the instantiated variable, and perform speech recognition (319, 125) of the instantiated variable to determine a most likely set of acoustic states (235). The electronic device may generate (335, 140) a synthesized value of the instantiated variable using the most likely set of acoustic states and the pitch and voicing characteristics of the instantiated variable. To disambiguate (425, 430) a newly received instantiated variable, the electronic device may use a table of previously entered variable values that have been determined to be unique, in which each value is associated with the most likely set of acoustic states and the pitch and voicing characteristics determined when that value was received.

    METHOD AND APPARATUS FOR SPEECH RECOGNITION
    3.
    Patent application
    Status: Under examination (published)

    Publication No.: US20090259469A1

    Publication date: 2009-10-15

    Application No.: US12102141

    Filing date: 2008-04-14

    IPC classification: G10L15/00 G10L15/02 G10L15/06

    CPC classification: G10L15/02 G10L15/142

    Abstract: A method and apparatus for performing speech recognition receives an audio signal, generates a sequence of frames of the audio signal, transforms each frame of the audio signal into a set of narrow band feature vectors using a narrow passband, couples the narrow band feature vectors to a speech model, and determines whether the audio signal is a wide band signal. When the audio signal is determined to be a wide band signal, a pass band parameter of each of one or more passbands that are outside the narrow passband is generated for each frame and the one or more band energy parameters are coupled to the speech model.
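
The feature path in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the 4 kHz narrow passband edge, the energy-ratio test for "wide band", the number of narrow sub-bands, and all function names are assumptions. Coupling the features to a speech model is left out, and frames are assumed to have an even length.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum (bins 0..N/2) of one even-length frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))) ** 2
        for k in range(n // 2 + 1)
    ]


def band_energy(spectrum, sample_rate, lo_hz, hi_hz):
    """Sum power over bins whose centre frequency lies in [lo_hz, hi_hz)."""
    n_fft = 2 * (len(spectrum) - 1)
    return sum(p for k, p in enumerate(spectrum)
               if lo_hz <= k * sample_rate / n_fft < hi_hz)


def is_wide_band(frames, sample_rate, narrow_hi_hz=4000.0, ratio=0.01):
    """Assumed test: wide band if energy above the narrow passband exceeds
    `ratio` of the total energy across all frames."""
    inside = outside = 0.0
    for frame in frames:
        spec = power_spectrum(frame)
        inside += band_energy(spec, sample_rate, 0.0, narrow_hi_hz)
        outside += band_energy(spec, sample_rate, narrow_hi_hz,
                               sample_rate / 2 + 1)
    total = inside + outside
    return total > 0 and outside / total > ratio


def features(frames, sample_rate, narrow_hi_hz=4000.0, n_narrow=4):
    """Per-frame narrow band log energies, plus one extra band energy
    parameter per frame when the signal is judged to be wide band."""
    wide = is_wide_band(frames, sample_rate, narrow_hi_hz)
    width = narrow_hi_hz / n_narrow
    out = []
    for frame in frames:
        spec = power_spectrum(frame)
        vec = [math.log(band_energy(spec, sample_rate,
                                    i * width, (i + 1) * width) + 1e-10)
               for i in range(n_narrow)]
        if wide:
            vec.append(math.log(band_energy(spec, sample_rate, narrow_hi_hz,
                                            sample_rate / 2 + 1) + 1e-10))
        out.append(vec)
    return out
```

The narrow band features are always computed, so a model trained on narrow band speech still receives the inputs it expects; the extra parameter only appears when energy is actually present outside the narrow passband.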

    Speech recognition by dynamical noise model adaptation
    4.
    Granted patent
    Status: In force

    Publication No.: US06950796B2

    Publication date: 2005-09-27

    Application No.: US10007886

    Filing date: 2001-11-05

    CPC classification: G10L15/20 G10L2021/02168

    Abstract: The invention provides a Hidden Markov Model (132) based automated speech recognition system (100) that dynamically adapts to changing background noise by detecting long pauses in speech, and for each pause processing background noise during the pause to extract a feature vector that characterizes the background noise, identifying a Gaussian mixture component of noise states that most closely matches the extracted feature vector, and updating the mean of the identified Gaussian mixture component so that it more closely matches the extracted feature vector, and consequently more closely matches the current noise environment. Alternatively, the process is also applied to refine the Gaussian mixtures associated with other emitting states of the Hidden Markov Model.
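
The adaptation step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the Euclidean distance used to pick the closest component, and the interpolation weight `rate` are all assumptions. The abstract only states that the mean of the best-matching component is updated to match the noise feature vector more closely.

```python
import math


def closest_component(means, feature):
    """Index of the mean vector nearest to `feature`.

    Euclidean distance is an assumption; a real system might instead score
    each Gaussian by its likelihood.
    """
    return min(range(len(means)), key=lambda i: math.dist(means[i], feature))


def adapt_noise_means(means, feature, rate=0.1):
    """Move the closest component mean a fraction `rate` toward `feature`.

    Returns the updated list of means and the index of the adapted component.
    The input list is left unmodified.
    """
    k = closest_component(means, feature)
    updated = [list(m) for m in means]
    updated[k] = [m + rate * (f - m) for m, f in zip(means[k], feature)]
    return updated, k
```

Calling `adapt_noise_means` once per detected pause, with the feature vector extracted from that pause, lets the noise model drift gradually toward the current environment while leaving the other components untouched.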

    Polyphone network method and apparatus
    5.
    Granted patent
    Status: In force

    Publication No.: US07319958B2

    Publication date: 2008-01-15

    Application No.: US10365820

    Filing date: 2003-02-13

    IPC classification: G10L15/04

    Abstract: Acoustic phones (preferably drawn (12) from a plurality of spoken languages) are provided (11). A hierarchically-organized polyphone network (20) organizes views of these phones of varying resolution and phone categorization as a function, at least in part, of phonetic similarity (14) and at least one language-independent phonological factor (15). In a preferred approach, a unique transcription system serves to represent the phones using only standard, printable ASCII characters, none of which comprises a special character (such as those characters that have a command significance for common script interpreters such as the UNIX command line).

    METHOD AND SYSTEM FOR PERSONALIZED VOICE DIALOGUE
    6.
    Patent application
    Status: Under examination (published)

    Publication No.: US20080080678A1

    Publication date: 2008-04-03

    Application No.: US11536854

    Filing date: 2006-09-29

    IPC classification: H04M11/00

    CPC classification: H04M3/4936 G10L2015/226

    Abstract: A method (10) and system (200) for personalized voice dialogue can include tracking (12) a user's use of voice dialogue states or transitions and progressively offering (16) a user more efficient voice dialogue transitions or states, such as voice dialogue transitions or states having fewer and fewer words. The tracking of dialogue states or transitions can include tracking (14) of repeated use of the dialogue states or transitions. A user can be prompted to create a new transition or state. The prompting (18) and confirmation and verification (20) by the user of a new transition or state can be done using the SCXML language. The method can further include instantiating (21) the new transition or state with voice tags or words and performing (22) speech recognition using the new transition or state. The method can again determine (23) whether the new transition or state is a repeat transition or state.
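
The track-then-offer loop in this abstract might look like the sketch below. It is an illustration only: the class `DialogueTracker`, the repeat threshold, and the representation of a transition as a tuple of spoken steps are assumptions, and the sketch uses plain Python rather than the SCXML mentioned in the abstract.

```python
from collections import Counter


class DialogueTracker:
    """Counts uses of each dialogue transition and flags repeats for shortening."""

    def __init__(self, repeat_threshold=3):
        self.repeat_threshold = repeat_threshold  # assumed cut-off for "repeated use"
        self.counts = Counter()
        self.shortcuts = {}  # shortcut phrase -> full transition (tuple of steps)

    def record(self, transition):
        """Track one use; return True when a shortcut should be offered."""
        key = tuple(transition)
        self.counts[key] += 1
        already_shortened = key in self.shortcuts.values()
        return self.counts[key] >= self.repeat_threshold and not already_shortened

    def add_shortcut(self, phrase, transition):
        """Store a shorter phrase after the user has confirmed it."""
        self.shortcuts[phrase] = tuple(transition)

    def resolve(self, utterance):
        """Expand a known shortcut; otherwise treat the utterance as one step."""
        return self.shortcuts.get(utterance, (utterance,))
```

When `record` returns True, the application would prompt the user for a shorter phrase and, once confirmed, register it with `add_shortcut`; later utterances of that phrase expand to the full sequence through `resolve`.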

    Method and Apparatus to Facilitate Conforming a Wireless Personal Communications Device to a Local Social Standard
    7.
    Patent application
    Status: Under examination (published)

    Publication No.: US20080207125A1

    Publication date: 2008-08-28

    Application No.: US11679704

    Filing date: 2007-02-27

    IPC classification: H04B7/00 H04B7/24 H04M1/00

    Abstract: A wireless transmitter (201) transmits (102) a message intended for at least one wireless personal communications device (202). That message comprises content (203) configured and arranged to at least attempt to prompt a particular operability configuration for the wireless personal communications device that conforms to social standards as correspond to a given local venue (204). Such content can vary with the application setting with some relevant examples comprising, but not being limited to, information indicative of a degree to which the operability configuration comprises a required operability configuration (as opposed to a voluntary or merely suggested configuration), information indicative of at least one particular capability of the wireless personal communication device to which the operability configuration pertains, and/or information corresponding to a time frame during which the operability configuration is applicable, to note but a few.

    METHOD AND SYSTEM FOR A USER INTERFACE USING HIGHER ORDER COMMANDS
    8.
    Patent application
    Status: Under examination (published)

    Publication No.: US20080114604A1

    Publication date: 2008-05-15

    Application No.: US11560139

    Filing date: 2006-11-15

    IPC classification: G10L21/00

    Abstract: A Higher Order Command Dialog System (HOCS) (250) for enabling voice control of a user interface is provided. The HOCS can record (302) a sequence of action steps a user performs while navigating a menu system to perform a task, prompt (304) a user to create an HOC for the task, and associate (306) the sequence of action steps with a Higher Order Command (HOC) for performing the task. The HOC can include multi-modal inputs (120/260) and prompt a user for non-specific additional information (124) required in performing the task. The HOCS can store the HOC as a voice tag or a user-input command.
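
The record-and-associate behaviour described here resembles a macro recorder, which the sketch below illustrates. All names (`HigherOrderCommands`, `missing`, `run`) and the use of `{placeholder}` fields to stand for the additional information a user must supply are assumptions made for illustration; the abstract does not specify a data format.

```python
from string import Formatter


class HigherOrderCommands:
    """Records menu action steps and replays them under a single command name."""

    def __init__(self):
        self._recording = None  # list of steps while recording, else None
        self.commands = {}      # command name -> list of action templates

    def start_recording(self):
        self._recording = []

    def step(self, action):
        """Note one action step; ignored when no recording is in progress."""
        if self._recording is not None:
            self._recording.append(action)

    def stop_recording(self, name):
        """Associate the recorded steps with the command `name`."""
        steps, self._recording = self._recording or [], None
        self.commands[name] = steps
        return steps

    def missing(self, name, info):
        """Placeholder fields the user still has to be prompted for."""
        needed = []
        for action in self.commands[name]:
            for _, field, _, _ in Formatter().parse(action):
                if field and field not in info and field not in needed:
                    needed.append(field)
        return needed

    def run(self, name, **info):
        """Return the concrete action steps with placeholders filled in."""
        return [action.format(**info) for action in self.commands[name]]
```

In use, the recorder would be started when the user begins navigating the menus and stopped once the user names the command; `missing` tells the dialog which details to ask for before `run` replays the steps.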
