Automated hotel attendant using speech recognition
    41.
    Granted invention patent
    Automated hotel attendant using speech recognition (status: no longer in force)

    Publication No.: US06314165B1

    Publication date: 2001-11-06

    Application No.: US09070399

    Filing date: 1998-04-30

    IPC classification: H04M1/64

    CPC classification: G10L15/26

    Abstract: An automated hotel attendant is provided for coordinating room-to-room calling over a telephone switching system that supports multiple telephone extensions. A hotel registration system receives and stores the spelled names of hotel guests and assigns each guest an associated telephone extension. A lexicon training system is connected to the hotel registration system for generating pronunciations for each spelled name by converting the characters that spell those names into word-phoneme data. This word-phoneme data is in turn stored in a lexicon that is used by a speech recognition system. In particular, a phoneticizer in conjunction with a Hidden Markov Model (HMM) based model trainer serves as the basis for the lexicon training system, such that one or several HMM models associated with each guest name are stored in the lexicon. An automated attendant is coupled to the speech recognition system for converting a spoken name of a hotel guest entered from one of the telephone extensions into a predefined hotel guest name that can be used to retrieve an assigned telephone extension from the hotel registration system. Next, the automated attendant causes the telephone switching system to call the requested telephone extension in response to the entry of the spoken name from one of the telephone extensions.
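
The registration-to-dialing flow in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `LETTER_TO_PHONEME` table, the `phoneticize` function and the `HotelAttendant` class are hypothetical names, and a standard-library sequence-similarity match stands in for the HMM-based recognizer that the patent describes.

```python
from difflib import SequenceMatcher

# Hypothetical one-letter-at-a-time phoneme table (an assumption for this
# sketch); a real phoneticizer would use context-dependent rules.
LETTER_TO_PHONEME = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F", "g": "G",
    "h": "HH", "i": "IH", "j": "JH", "k": "K", "l": "L", "m": "M", "n": "N",
    "o": "OW", "p": "P", "q": "K", "r": "R", "s": "S", "t": "T", "u": "AH",
    "v": "V", "w": "W", "x": "K S", "y": "Y", "z": "Z",
}


def phoneticize(spelled_name):
    """Convert a spelled name into a list of phoneme symbols, letter by letter."""
    phonemes = []
    for ch in spelled_name.lower():
        if ch in LETTER_TO_PHONEME:
            phonemes.extend(LETTER_TO_PHONEME[ch].split())
    return phonemes


class HotelAttendant:
    """Toy registration system plus lexicon; not the patented implementation."""

    def __init__(self):
        self.extensions = {}  # guest name -> telephone extension
        self.lexicon = {}     # guest name -> phoneme sequence

    def register_guest(self, name, extension):
        # Registration stores the spelled name and trains the lexicon entry.
        self.extensions[name] = extension
        self.lexicon[name] = phoneticize(name)

    def recognize(self, heard_phonemes):
        """Return the registered name whose lexicon entry best matches the input."""
        def similarity(name):
            return SequenceMatcher(None, self.lexicon[name], heard_phonemes).ratio()
        return max(self.lexicon, key=similarity)

    def extension_for(self, heard_phonemes):
        """Map a spoken name (as phonemes) to the extension to be dialed."""
        return self.extensions[self.recognize(heard_phonemes)]


attendant = HotelAttendant()
attendant.register_guest("Smith", 101)
attendant.register_guest("Jones", 102)
print(attendant.extension_for(["S", "M", "IH", "T"]))
```

The design point the sketch tries to capture is that the lexicon is rebuilt from registration data, so the recognizer's vocabulary follows the current guest list.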

    Supervised adaptation using corrective N-best decoding
    42.
    Granted invention patent
    Supervised adaptation using corrective N-best decoding (status: no longer in force)

    Publication No.: US06272462B1

    Publication date: 2001-08-07

    Application No.: US09257893

    Filing date: 1999-02-25

    IPC classification: G10L15/06

    CPC classification: G10L15/075 G10L2015/0635

    Abstract: Supervised adaptation speech is supplied to the recognizer and the recognizer generates the N-best transcriptions of the adaptation speech. These transcriptions include the one transcription known to be correct, based on a priori knowledge of the adaptation speech, and the remaining transcriptions known to be incorrect. The system applies weights to each transcription: a positive weight to the correct transcription and negative weights to the incorrect transcriptions. These weights have the effect of moving the incorrect transcriptions away from the correct one, rendering the recognition system more discriminative for the new speaker's speaking characteristics. Weights applied to the incorrect solutions are based on the respective likelihood scores generated by the recognizer. The sum of all weights (positive and negative) is a positive number. This ensures that the system will converge.
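
As a rough illustration of the weighting scheme, the sketch below gives the correct transcription a weight of +1 and spreads a total negative weight of `-rho` over the incorrect ones in proportion to their likelihoods. The function name `corrective_weights`, the parameter `rho` and the exact normalization are assumptions of this sketch; the abstract only states that the negative weights depend on the likelihood scores and that all weights sum to a positive number.

```python
import math


def corrective_weights(log_likelihoods, correct_index, rho=0.5):
    """Weight an N-best list: +1 for the correct entry, negatives for the rest.

    The negative weights are proportional to each incorrect hypothesis's
    likelihood and sum to -rho. With 0 <= rho < 1 and at least one incorrect
    hypothesis, the weights therefore sum to 1 - rho, which is positive.
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1) so the weights sum to a positive number")
    wrong = [i for i in range(len(log_likelihoods)) if i != correct_index]
    weights = [0.0] * len(log_likelihoods)
    weights[correct_index] = 1.0
    if wrong:
        # Subtract the best incorrect score before exponentiating to avoid underflow.
        top = max(log_likelihoods[i] for i in wrong)
        scaled = {i: math.exp(log_likelihoods[i] - top) for i in wrong}
        total = sum(scaled.values())
        for i in wrong:
            weights[i] = -rho * scaled[i] / total
    return weights


# Index 0 is the transcription known to be correct in this toy N-best list.
print(corrective_weights([-10.0, -11.0, -13.0], correct_index=0))
```

Under this choice, the incorrect hypothesis that the recognizer finds most likely receives the largest penalty, which is one plausible reading of "based on the respective likelihood scores".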

    Method and apparatus using decision trees to generate and score multiple pronunciations for a spelled word
    43.
    Granted invention patent
    Method and apparatus using decision trees to generate and score multiple pronunciations for a spelled word (status: no longer in force)

    Publication No.: US6016471A

    Publication date: 2000-01-18

    Application No.: US67764

    Filing date: 1998-04-29

    IPC classification: G10L13/08 G10L5/04

    CPC classification: G10L13/08

    Abstract: The mixed decision tree includes a network of yes-no questions about adjacent letters in a spelled word sequence and also about adjacent phonemes in the phoneme sequence corresponding to the spelled word sequence. Leaf nodes of the mixed decision tree provide information about which phonetic transcriptions are most probable. Using the mixed trees, scores are developed for each of a plurality of possible pronunciations, and these scores can be used to select the best pronunciation as well as to rank pronunciations in order of probability. The pronunciations generated by the system can be used in speech synthesis and speech recognition applications as well as lexicography applications.
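
The scoring idea can be sketched in Python with a toy set of trees. The tree contents, the probabilities, the one-phoneme-per-letter alignment (with "-" for a silent letter) and all names (`Leaf`, `Question`, `TREES`, `score_pronunciation`, `rank_pronunciations`) are assumptions invented for this example, not data or structures taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class Leaf:
    probs: Dict[str, float]  # phoneme -> probability at this leaf


@dataclass
class Question:
    ask: Callable[[str, List[str], int], bool]  # (letters, phonemes, position) -> yes/no
    yes: "Node"
    no: "Node"


Node = Union[Leaf, Question]


def next_letter_is_e_i_or_y(letters, phonemes, pos):
    # A question about an adjacent letter.
    return pos + 1 < len(letters) and letters[pos + 1] in "eiy"


def previous_phoneme_is_l(letters, phonemes, pos):
    # A question about an adjacent phoneme, which is what makes the tree "mixed".
    return pos > 0 and phonemes[pos - 1] == "L"


# One toy tree per letter; the probabilities are made up.
TREES: Dict[str, Node] = {
    "c": Question(next_letter_is_e_i_or_y,
                  Leaf({"S": 0.9, "K": 0.1}),
                  Leaf({"K": 0.9, "S": 0.1})),
    "e": Leaf({"EH": 0.7, "IY": 0.3}),
    "l": Question(previous_phoneme_is_l,
                  Leaf({"-": 0.9, "L": 0.1}),
                  Leaf({"L": 1.0})),
}


def score_pronunciation(letters, phonemes):
    """Multiply the leaf probabilities reached for each letter/phoneme pair."""
    if len(letters) != len(phonemes):
        raise ValueError("this sketch assumes one phoneme symbol per letter")
    score = 1.0
    for pos, letter in enumerate(letters):
        node = TREES[letter]
        while isinstance(node, Question):
            node = node.yes if node.ask(letters, phonemes, pos) else node.no
        score *= node.probs.get(phonemes[pos], 0.0)
    return score


def rank_pronunciations(letters, candidates):
    """Order candidate pronunciations from most to least probable."""
    return sorted(candidates, key=lambda p: score_pronunciation(letters, p), reverse=True)


candidates = [["K", "EH", "L", "-"], ["S", "IY", "L", "-"], ["S", "EH", "L", "-"]]
for pron in rank_pronunciations("cell", candidates):
    print(pron, score_pronunciation("cell", pron))
```

Because the second "l" asks about the phoneme chosen for the first "l", the score of a candidate depends on the phoneme sequence as well as the spelling.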

    Voice tagging, voice annotation, and speech recognition for portable devices with optional post processing
    44.
    Granted invention patent
    Voice tagging, voice annotation, and speech recognition for portable devices with optional post processing (status: in force)

    Publication No.: US07324943B2

    Publication date: 2008-01-29

    Application No.: US10677174

    Filing date: 2003-10-02

    IPC classification: G10L21/00 H04N5/76

    CPC classification: G06F17/30796 G10L15/26

    Abstract: A media capture device has an audio input receptive of user speech relating to a media capture activity in close temporal relation to the media capture activity. A plurality of focused speech recognition lexica respectively relating to media capture activities are stored on the device, and a speech recognizer recognizes the user speech based on a selected one of the focused speech recognition lexica. A media tagger tags captured media with generated speech recognition text, and a media annotator annotates the captured media with a sample of the user speech that is suitable for input to a speech recognizer. Tagging and annotating are based on close temporal relation between receipt of the user speech and capture of the captured media. Annotations may be converted to tags during post processing, employed to edit a lexicon using letter-to-sound rules and spelled word input, or matched directly to speech to retrieve captured media.
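
The notion of tagging by "close temporal relation" can be sketched as follows. The `Capture` class, the `tag_by_time` function and the five-second `window` are hypothetical choices for this example; the abstract does not specify how closeness in time is measured.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Capture:
    name: str
    time: float  # capture time in seconds
    tags: List[str] = field(default_factory=list)


def tag_by_time(captures, utterances, window=5.0):
    """Attach each recognized utterance to the capture nearest in time.

    utterances is a list of (time, recognized_text) pairs. An utterance is
    dropped when no capture lies within `window` seconds of it.
    """
    for when, text in utterances:
        if not captures:
            break
        nearest = min(captures, key=lambda c: abs(c.time - when))
        if abs(nearest.time - when) <= window:
            nearest.tags.append(text)
    return captures


photos = [Capture("a.jpg", 10.0), Capture("b.jpg", 60.0)]
tag_by_time(photos, [(12.0, "beach"), (58.5, "sunset"), (200.0, "unrelated")])
for photo in photos:
    print(photo.name, photo.tags)
```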

    Portable device for enhanced security and accessibility
    45.
    Granted invention patent
    Portable device for enhanced security and accessibility (status: in force)

    Publication No.: US07249025B2

    Publication date: 2007-07-24

    Application No.: US10435144

    Filing date: 2003-05-09

    IPC classification: G10L15/22

    CPC classification: G06F3/16 G10L15/26 G10L17/00

    Abstract: A portable device increases user access to equipment utilizing a communications interface providing communication with the equipment in accordance with various, combinable embodiments. In one embodiment, a speech generator generates speech based on commands relating to equipment operation, which may be received from the equipment via the communications interface. A selection mechanism allows the user to select commands and thereby operate the equipment. In another embodiment, a command navigator navigates commands based on user input by shifting focus between commands, communicates a command having the focus to the speech generator, and allows the user to select a command. In a further embodiment, a phoneticizer converts the commands and/or predetermined navigation and selection options into a dynamic speech lexicon, and a speech recognizer uses the lexicon to recognize a user navigation input and/or user selection of a command. Speaker verification can also be used to enhance security using a speech biometric.
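
The command navigator embodiment (shift focus, announce the focused command, select it) can be sketched in a few lines. The `CommandNavigator` class, its wrap-around behavior and the `speak` callback that stands in for the speech generator are all assumptions of this sketch.

```python
class CommandNavigator:
    """Keeps a focus index over a command list and announces the focused command."""

    def __init__(self, commands, speak):
        if not commands:
            raise ValueError("at least one command is required")
        self.commands = list(commands)
        self.speak = speak  # callback standing in for the speech generator
        self.focus = 0

    def move(self, step):
        """Shift focus by `step` positions, wrapping around, and announce the result."""
        self.focus = (self.focus + step) % len(self.commands)
        self.speak(self.commands[self.focus])
        return self.commands[self.focus]

    def select(self):
        """Return the command that currently has the focus."""
        return self.commands[self.focus]


nav = CommandNavigator(["copy", "scan", "fax"], speak=print)
nav.move(1)
nav.move(1)
print("selected:", nav.select())
```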

    Multimodal concierge for secure and convenient access to a home or building
    46.
    Granted invention patent
    Multimodal concierge for secure and convenient access to a home or building (status: in force)

    Publication No.: US07064652B2

    Publication date: 2006-06-20

    Application No.: US10237758

    Filing date: 2002-09-09

    IPC classification: G05B19/00 G10L11/00 G10L17/00

    CPC classification: H04L63/0861 G07C9/00158

    Abstract: An improved method is provided for enrolling with a resource security system. The method includes: providing an access code to a system user; accessing the resource security system using the access code; prompting the user to input a biometric feature which identifies the user; capturing a biometric feature associated with the user; and associating the captured biometric feature with the identity of the user for subsequent verification. The method further includes subsequently granting access to the secured resource based on biometric feature data input by the user.
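
The enrollment steps listed in this abstract can be sketched as a small class. The names (`ResourceSecuritySystem`, `issue_access_code`, `enroll`, `grant_access`), the single-use access code, the representation of a biometric feature as a tuple of floats and the distance threshold are all assumptions made for illustration.

```python
import math
import secrets


class ResourceSecuritySystem:
    """Toy enrollment and verification flow; not the patented implementation."""

    def __init__(self):
        self._pending = {}   # access code -> user identity awaiting enrollment
        self._enrolled = {}  # user identity -> stored biometric feature vector

    def issue_access_code(self, user_id):
        """Step 1: provide an access code to a system user."""
        code = secrets.token_hex(4)
        self._pending[code] = user_id
        return code

    def enroll(self, access_code, biometric_sample):
        """Steps 2-5: accept the code, capture the feature, bind it to the identity."""
        user_id = self._pending.pop(access_code, None)  # code is single-use here
        if user_id is None:
            return False
        self._enrolled[user_id] = tuple(biometric_sample)
        return True

    def grant_access(self, user_id, biometric_sample, threshold=0.5):
        """Later verification: compare a fresh sample with the stored feature."""
        template = self._enrolled.get(user_id)
        if template is None:
            return False
        return math.dist(template, biometric_sample) <= threshold


system = ResourceSecuritySystem()
code = system.issue_access_code("alice")
system.enroll(code, (0.1, 0.9, 0.3))
print(system.grant_access("alice", (0.12, 0.88, 0.31)))
```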

    Method and apparatus for improved speech recognition with supplementary information
    47.
    Granted invention patent
    Method and apparatus for improved speech recognition with supplementary information (status: in force)

    Publication No.: US06983244B2

    Publication date: 2006-01-03

    Application No.: US10652146

    Filing date: 2003-08-29

    IPC classification: G10L15/22

    Abstract: A method for improving recognition results of a speech recognizer uses supplementary information to confirm recognition results. A user inputs speech to a speech recognizer. The speech recognizer resides on a mobile device or on a server at a remote location. The speech recognizer determines a recognition result based on the input speech. A confidence measure is calculated for the recognition result. If the confidence measure is below a threshold, the user is prompted for supplementary data. The supplementary data is determined dynamically based on ambiguities between the input speech and the recognition result, wherein the supplementary data will distinguish the input speech from potentially incorrect results. The supplementary data may be a subset of the alphanumeric characters that make up the input speech, or other data associated with a desired result, such as an area code or location. The user may provide the supplementary data verbally, or manually using a keypad, touchpad, touchscreen, or stylus pen.
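
A minimal sketch of the confidence check and the dynamically chosen supplementary prompt follows. The confidence formula (top score divided by the sum of scores), the choice of "first differing character" as the supplementary data, and the names `confidence`, `distinguishing_position` and `resolve` are assumptions of this example; the abstract does not prescribe them.

```python
def confidence(nbest):
    """nbest is a list of (text, score) pairs, best first, with positive scores."""
    total = sum(score for _, score in nbest)
    return nbest[0][1] / total


def distinguishing_position(candidates):
    """Return the first character index at which the candidates are not all equal."""
    for i in range(max(len(c) for c in candidates)):
        if len({c[i:i + 1] for c in candidates}) > 1:
            return i
    return None


def resolve(nbest, ask_user, threshold=0.6):
    """Accept the top result if confident; otherwise ask for one distinguishing letter."""
    if confidence(nbest) >= threshold:
        return nbest[0][0]
    candidates = [text for text, _ in nbest]
    pos = distinguishing_position(candidates)
    if pos is None:
        return nbest[0][0]
    letter = ask_user(pos)  # e.g. "What is the first letter?" answered by voice or keypad
    for text in candidates:  # best first, so ties go to the higher-scoring candidate
        if text[pos:pos + 1].lower() == letter.lower():
            return text
    return nbest[0][0]


nbest = [("Austin", 0.40), ("Boston", 0.35), ("Houston", 0.25)]
print(resolve(nbest, ask_user=lambda pos: "B"))
```

The prompt is computed from the competing hypotheses themselves, which is the sense in which the supplementary data is "determined dynamically" here.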

    Distributed apparatus to improve safety and communication for law enforcement applications
    48.
    Granted invention patent
    Distributed apparatus to improve safety and communication for law enforcement applications (status: no longer in force)

    Publication No.: US06952164B2

    Publication date: 2005-10-04

    Application No.: US10287954

    Filing date: 2002-11-05

    IPC classification: G07B15/02 G08B21/00

    CPC classification: G07B15/00

    Abstract: A wearable, computerized apparatus for use with law enforcement has an evidence collector adapted to collect evidentiary information of a type collected according to law enforcement procedures and useful for identification of a suspect. It further has a safety monitor adapted to collect safety information relating to well-being of an officer. A wireless communications link communicates the evidentiary information and the safety information to a centralized component of a distributed communications system to assist in identifying suspects and dispatching assistance.

    Personalized agent for portable devices and cellular phone
    49.
    Granted invention patent
    Personalized agent for portable devices and cellular phone (status: in force)

    Publication No.: US06895257B2

    Publication date: 2005-05-17

    Application No.: US10077904

    Filing date: 2002-02-18

    Abstract: Personalized agent services are provided in a personal messaging device, such as a cellular telephone or personal digital assistant, through services of a speech recognizer that converts speech into text and a text-to-speech synthesizer that converts text to speech. Both recognizer and synthesizer may be server-based or locally deployed within the device. The user dictates an e-mail message which is converted to text and stored. The stored text is sent back to the user as text or as synthesized speech, to allow the user to edit the message and correct transcription errors before sending as e-mail. The system includes a summarization module that prepares short summaries of incoming e-mail and voice mail. The user may access these summaries, and retrieve and organize e-mail and voice mail using speech commands.

    Method and apparatus for improved speech recognition with supplementary information
    50.
    Invention patent application
    Method and apparatus for improved speech recognition with supplementary information (status: in force)

    Publication No.: US20050049860A1

    Publication date: 2005-03-03

    Application No.: US10652146

    Filing date: 2003-08-29

    Abstract: A method for improving recognition results of a speech recognizer uses supplementary information to confirm recognition results. A user inputs speech to a speech recognizer. The speech recognizer resides on a mobile device or on a server at a remote location. The speech recognizer determines a recognition result based on the input speech. A confidence measure is calculated for the recognition result. If the confidence measure is below a threshold, the user is prompted for supplementary data. The supplementary data is determined dynamically based on ambiguities between the input speech and the recognition result, wherein the supplementary data will distinguish the input speech from potentially incorrect results. The supplementary data may be a subset of the alphanumeric characters that make up the input speech, or other data associated with a desired result, such as an area code or location. The user may provide the supplementary data verbally, or manually using a keypad, touchpad, touchscreen, or stylus pen.
