Corporate voice dialing with shared directories
    1.
    Granted invention patent
    Status: expired

    Publication No.: US5924070A

    Publication date: 1999-07-13

    Application No.: US870373

    Filing date: 1997-06-06

    Abstract: Voice-controlled customized commands can be shared among users. Each command comprises a customized action to be performed, such as a number to be dialed to connect with an address of a corporate voice dialing system, and the speech pattern or utterance that a user enrolls to invoke it; other users can use the command if the enrolling user authorizes them. When a current user wants to use a customized command enrolled by another user, a command (preferably voice-actuated) is invoked that searches a database containing a page of customized commands for each user and returns the commands the current user is authorized to access, in accordance with aliases established by the enrolling user. The returned commands are preferably presented to the current user as a menu from which the current user can select an authorized command and have it executed.
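The lookup described in this abstract can be illustrated with a minimal Python sketch. Everything named here (`Command`, `SharedDirectory`, the alias layout, the sample numbers) is a hypothetical illustration, not the patent's actual data model; an enrolled utterance is represented by a plain text label rather than a speech pattern.

```python
from dataclasses import dataclass, field


@dataclass
class Command:
    """One customized command enrolled by a user (hypothetical structure)."""
    utterance: str                                 # text label standing in for the enrolled speech pattern
    number: str                                    # number dialed when the command is executed
    shared_with: set = field(default_factory=set)  # alias names granted access by the enrolling user


class SharedDirectory:
    """Database holding one page of customized commands per enrolling user."""

    def __init__(self):
        self.pages = {}    # enrolling user -> list of Command
        self.aliases = {}  # (enrolling user, alias name) -> set of authorized users

    def define_alias(self, owner, alias, members):
        self.aliases[(owner, alias)] = set(members)

    def enroll(self, owner, command):
        self.pages.setdefault(owner, []).append(command)

    def search(self, current_user):
        """Return (owner, command) pairs that current_user is authorized to use."""
        menu = []
        for owner, page in self.pages.items():
            for command in page:
                allowed = any(
                    current_user in self.aliases.get((owner, alias), set())
                    for alias in command.shared_with
                )
                if owner == current_user or allowed:
                    menu.append((owner, command))
        return menu

    def execute(self, current_user, choice):
        """Return the number to dial for the chosen menu entry."""
        owner, command = self.search(current_user)[choice]
        return command.number
```

In this sketch a command with an empty `shared_with` set stays private to its owner, so authorization is opt-in per command, which matches the abstract's condition that sharing happens only if the enrolling user authorizes it.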

System and method of using pre-enrolled speech sub-units for efficient speech synthesis
    2.
    Granted invention patent
    Status: expired

    Publication No.: US06041300A

    Publication date: 2000-03-21

    Application No.: US821520

    Filing date: 1997-03-21

    IPC classification: G10L15/22 H04M1/27 G10L5/04

    CPC classification: G10L15/22 H04M1/271

    Abstract: A speech recognition system is disclosed that is useful in, for example, hands-free voice telephone dialing applications. The system matches a spoken word (token) to one previously enrolled in the system. It then synthesizes or replays the recognized word so that the speaker can confirm that it is indeed the correct word before further action is taken. In the case of voice-activated dialing, this avoids wrong numbers. The token itself is not explicitly recorded; rather, only the lefemes may be recorded, from which the token can be reconstructed for playback. This greatly reduces the disk space needed for the database and makes it possible to reconstruct data in real time for synthesis by a local name recognition machine.
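The storage idea in this abstract (keep only a sequence of sub-unit labels per token and rebuild the audio on demand) can be sketched as follows. This is an illustrative assumption, not the patented method: lefemes are treated as opaque labels, `UNIT_INVENTORY` and `TokenStore` are hypothetical names, and short integer lists stand in for audio samples.

```python
# Hypothetical inventory of pre-enrolled sub-unit waveforms shared by all tokens.
# Real entries would be audio segments; short integer lists stand in for samples.
UNIT_INVENTORY = {
    "JH": [3, 5, 2],
    "AA": [7, 7, 6, 4],
    "N": [1, 2],
}


class TokenStore:
    """Stores each enrolled token as a list of sub-unit labels, not as audio."""

    def __init__(self, inventory):
        self.inventory = inventory
        self.tokens = {}  # token name -> list of sub-unit labels

    def enroll(self, name, units):
        unknown = [u for u in units if u not in self.inventory]
        if unknown:
            raise KeyError(f"unknown sub-units: {unknown}")
        self.tokens[name] = list(units)

    def synthesize(self, name):
        """Rebuild a playback waveform by concatenating the stored sub-units."""
        samples = []
        for unit in self.tokens[name]:
            samples.extend(self.inventory[unit])
        return samples
```

Because every token refers to the same shared inventory, the per-token cost is a short label list, which is the space saving the abstract points to.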

    Apparatus and methods for user identification to deny access or service to unauthorized users
    3.
    Granted invention patent
    Status: expired

    Publication No.: US06246751B1

    Publication date: 2001-06-12

    Application No.: US08908121

    Filing date: 1997-08-11

    IPC classification: H04M164

    Abstract: Apparatus for preventing unauthorized use of a voice dialing system and, in particular, of a call forwarding feature associated with the system, whereby system users may forward the telephone numbers respectively associated with them to a remote location in order to receive phone calls there. The apparatus comprises: a database for pre-storing telephone numbers of system users and for pre-storing acoustic models respectively representative of speech associated with each system user, the acoustic models respectively corresponding to the telephone numbers; and a speaker identification module operatively coupled to the database for obtaining and decoding a speech sample from a potential system user during the potential user's attempt to make a telephone call, the speaker identification module comparing the decoded speech sample with the pre-stored acoustic model associated with the telephone number dialed by the potential user; whereby, if the decoded speech sample substantially matches the pre-stored acoustic model, the phone call attempted by the potential user is terminated.
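The decision rule in this abstract can be sketched in a few lines of Python. The sketch rests on stated assumptions: an acoustic model is reduced to a single feature vector, "substantially matches" is approximated by a Euclidean distance threshold, and `screen_call`, `threshold`, and the sample numbers are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def screen_call(dialed_number, sample, models, threshold=1.0):
    """Decide whether to terminate a call, following the abstract's rule.

    models maps each system user's telephone number to that user's
    pre-stored acoustic model (here simply a feature vector).
    """
    model = models.get(dialed_number)
    if model is None:
        # Assumption: with no stored model there is nothing to compare against.
        return "connect"
    if distance(sample, model) <= threshold:
        # The caller's speech matches the model tied to the dialed number.
        return "terminate"
    return "connect"
```

Note that the comparison is against the model for the dialed number, not the caller's own line, so a match means the caller is the user to whom the dialed number belongs.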

    Apparatus and methods for shift invariant speech recognition
    4.
    Granted invention patent
    Status: expired

    Publication No.: US5956671A

    Publication date: 1999-09-21

    Application No.: US868860

    Filing date: 1997-06-04

    CPC classification: G10L15/02

    Abstract: The present invention includes a method of generating a set of substantially shift-invariant acoustic features from an input speech signal, comprising the steps of: splitting the input speech signal into a plurality of input speech signals; respectively delaying a majority of the input speech signals by a successively incrementing time interval; respectively extracting a plurality of sets of acoustic features from the plurality of input speech signals; summing the plurality of sets of acoustic features to form a set of summed acoustic features; and dividing the set of summed acoustic features by the number of sets of acoustic features summed in the summing step, thereby forming a set of averaged acoustic features that are substantially shift invariant. Further, the present invention may include a method for generating at least one substantially shift-invariant speech recognition model from speech training data, comprising the steps of: inputting the speech training data a first time; extracting acoustic features from the speech training data input the first time; inputting the speech training data a plurality of times thereafter, each time delaying the input speech training data by a successively incrementing time interval; respectively extracting acoustic features from the delayed speech training data input each time; and using at least the acoustic features extracted in the extracting steps to form the at least one speech recognition model, which is substantially shift invariant. Still further, the present invention may include a synchrosqueezing process in the feature extraction steps. The invention also contemplates implementing these processes individually, in combination with one another, or all together.
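The first method in this abstract (split, delay, extract, sum, divide) can be sketched in plain Python. The feature extractor here is a deliberately simple stand-in, per-frame energy, and is an assumption of this sketch; the patent's actual acoustic features and delay intervals are not specified in the abstract. All function names are hypothetical.

```python
def frame_energies(signal, frame_len):
    """Toy feature extractor: energy of each non-overlapping frame."""
    n_frames = len(signal) // frame_len
    return [
        sum(s * s for s in signal[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]


def delay(signal, shift):
    """Delay the signal by `shift` samples, zero-padding the start and keeping the length."""
    return [0.0] * shift + list(signal[:len(signal) - shift])


def shift_averaged_features(signal, frame_len, n_copies):
    """Average the features of n_copies copies delayed by 0, 1, ..., n_copies - 1 samples."""
    feature_sets = [
        frame_energies(delay(signal, k), frame_len) for k in range(n_copies)
    ]
    # Sum the feature sets element-wise, then divide by the number of sets summed.
    return [sum(column) / n_copies for column in zip(*feature_sets)]
```

For example, `shift_averaged_features([0.0, 1.0, 0.0, 0.0], 2, 2)` spreads the single pulse evenly over both frames, because the one-sample delay moves it across the frame boundary; that smoothing across frame alignment is the effect the averaging is meant to achieve.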

Speech recognition using thresholded speaker class model selection or model adaptation
    6.
    Granted invention patent
    Status: expired

    Publication No.: US5895447A

    Publication date: 1999-04-20

    Application No.: US787031

    Filing date: 1997-01-28

    IPC classification: G10L15/06 G10L21/02 G01L9/06

    Abstract: Clusters of quantized feature vectors are processed against each other using a threshold distance value to cluster mean values of sets of parameters contained in speaker-specific codebooks, forming classes of speakers against which feature vectors computed from an arbitrary input speech signal can be compared to identify a speaker class. The number of codebooks considered in the comparison may thus be reduced, limiting the mixture elements that engender ambiguity and reduce system response speed when the speaker population becomes large. A speaker class processing model that is speaker independent within the class may be trained on one or more members of the class and selected for implementation in a speech recognition processor in accordance with the recognized speaker class, further improving speech recognition to a level comparable to that of a speaker-dependent model. Formation of speaker classes can be supervised by identifying groups of speakers to be included in a class, with the speaker-class-dependent model trained on members of the respective group.
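The class formation and lookup described in this abstract can be illustrated with a small Python sketch. It substitutes a greedy first-fit threshold clustering for whatever procedure the patent actually uses, and it reduces each speaker's codebook to a single mean vector; both are assumptions of this sketch, as are the names `form_classes` and `identify_class`.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def form_classes(codebook_means, threshold):
    """Greedy clustering: a speaker joins the first class whose centroid is within threshold."""
    classes = []  # each class: {"members": [speaker ids], "centroid": vector}
    for speaker, vec in codebook_means.items():
        for cls in classes:
            if dist(vec, cls["centroid"]) <= threshold:
                cls["members"].append(speaker)
                cls["centroid"] = mean([codebook_means[m] for m in cls["members"]])
                break
        else:
            classes.append({"members": [speaker], "centroid": list(vec)})
    return classes


def identify_class(feature_vectors, classes):
    """Return the index of the class whose centroid is closest to the mean input vector."""
    query = mean(feature_vectors)
    return min(range(len(classes)), key=lambda i: dist(query, classes[i]["centroid"]))
```

Comparing the input against one centroid per class instead of one codebook per speaker is what keeps the number of comparisons small as the speaker population grows; the index returned by `identify_class` would then select the class-specific recognition model.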
