Reducing recording time when constructing a concatenative TTS voice using a reduced script and pre-recorded speech assets
    1.
    Granted patent
    Status: Active

    Publication No.: US08019605B2

    Publication date: 2011-09-13

    Application No.: US11748256

    Filing date: 2007-05-14

    IPC classification: G10L13/08 G10L13/06

    CPC classification: G10L13/04

    Abstract: The present invention discloses a system and a method for creating a reduced script, which is read by a voice talent to create a concatenative text-to-speech (TTS) voice. The method can automatically process pre-recorded audio to derive speech assets for a concatenative TTS voice. The pre-recorded audio can include sets of recorded phrases used by a speech user interface (SUI). A set of unfulfilled speech assets needed for full phonetic coverage of the concatenative TTS voice can be determined. A reduced script can be constructed that includes a set of phrases which, when read by a voice talent, result in a reduced corpus. When the reduced corpus is automatically processed, a reduced set of speech assets results. The reduced set includes each of the unfulfilled speech assets. When this reduced corpus is combined with the existing speech assets, the result is a voice with a complete set of speech assets.
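
    The coverage step in this abstract can be illustrated with a minimal sketch. It assumes speech assets are identified by string labels and that each candidate phrase is already annotated with the assets it would yield. The greedy selection strategy, the function name `build_reduced_script`, and the sample labels are illustrative assumptions, not the patented method.

```python
from typing import Dict, List, Set


def build_reduced_script(
    needed: Set[str],
    existing: Set[str],
    candidates: Dict[str, Set[str]],
) -> List[str]:
    """Greedily pick phrases until every unfulfilled asset is covered.

    needed     -- assets required for full phonetic coverage
    existing   -- assets already derived from pre-recorded audio
    candidates -- hypothetical map: phrase -> assets that phrase would yield
    """
    unfulfilled = set(needed) - set(existing)
    script: List[str] = []
    while unfulfilled:
        # Choose the phrase that covers the most still-missing assets.
        best = max(
            candidates,
            key=lambda phrase: len(candidates[phrase] & unfulfilled),
            default=None,
        )
        gained = candidates[best] & unfulfilled if best is not None else set()
        if not gained:
            raise ValueError(f"no phrase covers: {sorted(unfulfilled)}")
        script.append(best)
        unfulfilled -= gained
    return script


# Hypothetical diphone-style labels, for illustration only.
script = build_reduced_script(
    needed={"a-b", "b-c", "c-d", "d-e"},
    existing={"a-b"},
    candidates={
        "phrase one": {"b-c"},
        "phrase two": {"b-c", "c-d"},
        "phrase three": {"d-e", "a-b"},
    },
)
```

    Greedy selection does not guarantee the shortest possible script, but it keeps the sketch short and always terminates, either with full coverage or with an error naming the assets no phrase can supply.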

    Adjusting a speech engine for a mobile computing device based on background noise
    2.
    Granted patent
    Status: Active

    Publication No.: US09076454B2

    Publication date: 2015-07-07

    Application No.: US13358097

    Filing date: 2012-01-25

    IPC classification: G10L15/20 G10L21/0208

    CPC classification: G10L21/0208 G10L15/20

    Abstract: Methods, apparatus, and products are disclosed for adjusting a speech engine for a mobile computing device based on background noise, the mobile computing device operatively coupled to a microphone, that include: sampling, through the microphone, background noise for a plurality of operating environments in which the mobile computing device operates; generating, for each operating environment, a noise model in dependence upon the sampled background noise for that operating environment; and configuring the speech engine for the mobile computing device with the noise model for the operating environment in which the mobile computing device currently operates.
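
    The three steps named in this abstract (sample, generate a model per environment, configure the engine) can be sketched as follows. The abstract does not say what a noise model contains, so the `NoiseModel` fields, the `SpeechEngine` class, and the environment names below are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from statistics import fmean, pstdev
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class NoiseModel:
    """Hypothetical noise model: average level and spread of the samples."""
    mean_level: float
    spread: float


class SpeechEngine:
    """Hypothetical stand-in for a configurable speech engine."""

    def __init__(self) -> None:
        self.noise_model: Optional[NoiseModel] = None

    def configure(self, model: NoiseModel) -> None:
        self.noise_model = model


def build_noise_models(
    samples_by_env: Dict[str, Sequence[float]],
) -> Dict[str, NoiseModel]:
    """Generate one noise model per operating environment.

    Assumes at least one sample per environment.
    """
    models: Dict[str, NoiseModel] = {}
    for env, samples in samples_by_env.items():
        levels = [abs(s) for s in samples]  # rectified amplitudes
        models[env] = NoiseModel(fmean(levels), pstdev(levels))
    return models


models = build_noise_models({"car": [0.4, -0.4, 0.4], "office": [0.1, -0.1]})
engine = SpeechEngine()
engine.configure(models["car"])  # the device is assumed to be in the car now
```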

    Enhancing media playback with speech recognition
    3.
    Granted patent
    Status: Active

    Publication No.: US08478592B2

    Publication date: 2013-07-02

    Application No.: US12180583

    Filing date: 2008-07-28

    Applicant: Paritosh D. Patel

    Inventor: Paritosh D. Patel

    Abstract: A method for enhancing a media file to enable speech recognition of spoken navigation commands can be provided. The method can include receiving a plurality of textual items based on subject matter of the media file and generating a grammar for each textual item, thereby generating a plurality of grammars for use by a speech recognition engine. The method can further include associating a time stamp with each grammar, wherein a time stamp indicates a location in the media file of a textual item corresponding with a grammar. The method can further include associating the plurality of grammars with the media file, such that speech recognized by the speech recognition engine is associated with a corresponding location in the media file.
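
    A minimal sketch of the grammar-to-time-stamp association follows. It simplifies each "grammar" to a single literal phrase and treats the recognizer's output as plain text. The names `TimedGrammar`, `build_grammars`, and `seek_position`, and the sample items, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class TimedGrammar:
    phrase: str         # text the recognizer should accept
    timestamp_s: float  # location of the item in the media file, in seconds


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so lookups are forgiving."""
    return " ".join(text.lower().split())


def build_grammars(items: Iterable[Tuple[str, float]]) -> Dict[str, TimedGrammar]:
    """Create one timed grammar per textual item."""
    return {_normalize(text): TimedGrammar(text, ts) for text, ts in items}


def seek_position(
    recognized: str, grammars: Dict[str, TimedGrammar]
) -> Optional[float]:
    """Map recognized speech to a media location, or None if nothing matches."""
    match = grammars.get(_normalize(recognized))
    return match.timestamp_s if match is not None else None


grammars = build_grammars([("Opening credits", 0.0), ("Car chase", 1325.5)])
position = seek_position("car chase", grammars)
```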

    Partially filling mixed-initiative forms from utterances having sub-threshold confidence scores based upon word-level confidence data
    4.
    Granted patent
    Status: Active

    Publication No.: US07870000B2

    Publication date: 2011-01-11

    Application No.: US11692741

    Filing date: 2007-03-28

    IPC classification: G10L15/16

    CPC classification: G10L15/22 G10L15/193

    Abstract: The present disclosure relates to prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be converted from speech to text to derive values for each of the multiple elements. An utterance-level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold, and a second set can have scores below it. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played.
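
    The partial-fill logic in this abstract can be sketched as below. The threshold values, the function name `fill_form`, and the choice to accept a score exactly equal to the threshold are assumptions; the abstract only distinguishes scores above and below a threshold.

```python
from typing import Dict, List, Tuple


def fill_form(
    elements: Dict[str, Tuple[str, float]],
    utterance_confidence: float,
    utterance_threshold: float = 0.7,
    element_threshold: float = 0.6,
) -> Tuple[Dict[str, str], List[str]]:
    """Return (values to store, fields that still need a prompt).

    elements maps a field name to (recognized value, element-level confidence).
    """
    if utterance_confidence >= utterance_threshold:
        # The whole utterance is trusted, so every value is stored.
        return {name: value for name, (value, _) in elements.items()}, []

    stored: Dict[str, str] = {}
    reprompt: List[str] = []
    for name, (value, confidence) in elements.items():
        if confidence >= element_threshold:
            stored[name] = value   # first set: confident enough to keep
        else:
            reprompt.append(name)  # second set: ask the caller again
    return stored, reprompt


stored, reprompt = fill_form(
    {"city": ("Boston", 0.9), "date": ("May 3", 0.4)},
    utterance_confidence=0.5,
)
```

    Here the utterance as a whole falls below its threshold, yet the city is kept and only the date is prompted for again.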

    USING FINITE STATE GRAMMARS TO VARY OUTPUT GENERATED BY A TEXT-TO-SPEECH SYSTEM
    5.
    Patent application
    Status: Under examination (published)

    Publication No.: US20080312929A1

    Publication date: 2008-12-18

    Application No.: US11761852

    Filing date: 2007-06-12

    IPC classification: G10L13/00

    CPC classification: G10L13/027

    Abstract: The present invention discloses a text-to-speech system that provides output variability. The system can include a finite state grammar, a variability engine, and a text-to-speech engine. The finite state grammar can contain a phrase rule consisting of one or more phrase elements. The phrase rule can deterministically generate a variable text phrase based upon at least one random number. The phrase rule can include a definition for each of the phrase elements. Each definition can be associated with at least one defined text string. The variability engine can construct a random text phrase responsive to receiving an action command, wherein said finite state grammar is used to create the text phrase. The variability engine can also rely on user-specified weights to adjust the output probabilities. The text-to-speech engine can convert the text phrase generated by the variability engine into speech output.
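
    A minimal sketch of a weighted phrase rule follows. It represents the grammar as a dictionary from phrase element to weighted text strings and uses Python's `random.Random.choices` for the weighted draw. The data layout, the name `generate_phrase`, and the sample strings are assumptions, not the format used in the application.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout: phrase element -> list of (text string, weight).
Grammar = Dict[str, List[Tuple[str, float]]]


def generate_phrase(rule: Sequence[str], grammar: Grammar, rng: random.Random) -> str:
    """Expand each phrase element of the rule into one weighted random string.

    The result is deterministic for a given sequence of random numbers,
    so two generators seeded alike produce the same phrase.
    """
    parts = []
    for element in rule:
        texts = [text for text, _ in grammar[element]]
        weights = [weight for _, weight in grammar[element]]
        parts.append(rng.choices(texts, weights=weights, k=1)[0])
    return " ".join(parts)


grammar: Grammar = {
    "greeting": [("Hello,", 3.0), ("Hi,", 1.0)],
    "request": [("how can I help you?", 1.0), ("what do you need?", 1.0)],
}
phrase = generate_phrase(["greeting", "request"], grammar, random.Random(42))
```

    The returned string would then be handed to a text-to-speech engine; that step is outside the sketch.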

    REDUCING RECORDING TIME WHEN CONSTRUCTING A CONCATENATIVE TTS VOICE USING A REDUCED SCRIPT AND PRE-RECORDED SPEECH ASSETS
    6.
    Patent application
    Status: Active

    Publication No.: US20080288256A1

    Publication date: 2008-11-20

    Application No.: US11748256

    Filing date: 2007-05-14

    IPC classification: G10L13/08

    CPC classification: G10L13/04

    Abstract: The present invention discloses a system and a method for creating a reduced script, which is read by a voice talent to create a concatenative text-to-speech (TTS) voice. The method can automatically process pre-recorded audio to derive speech assets for a concatenative TTS voice. The pre-recorded audio can include sets of recorded phrases used by a speech user interface (SUI). A set of unfulfilled speech assets needed for full phonetic coverage of the concatenative TTS voice can be determined. A reduced script can be constructed that includes a set of phrases which, when read by a voice talent, result in a reduced corpus. When the reduced corpus is automatically processed, a reduced set of speech assets results. The reduced set includes each of the unfulfilled speech assets. When this reduced corpus is combined with the existing speech assets, the result is a voice with a complete set of speech assets.

    SYSTEM AND METHOD FOR IMPROVING MESSAGE DELIVERY IN VOICE SYSTEMS UTILIZING MICROPHONE AND TARGET SIGNAL-TO-NOISE RATIO
    7.
    Patent application
    Status: Active

    Publication No.: US20080147386A1

    Publication date: 2008-06-19

    Application No.: US11612329

    Filing date: 2006-12-18

    IPC classification: G10L11/00

    CPC classification: G10L21/0208

    Abstract: A method for delivering a message to a recipient in an environment with ambient noise includes the steps of recording the ambient noise in the environment at a certain time interval, analyzing the recorded ambient noise to obtain an average power P_noise or an RMS amplitude A_noise of the ambient noise, providing a predetermined desired signal-to-noise ratio SNR_desired, calculating an average signal power P_signal or an RMS amplitude A_signal of the message to be delivered based on P_noise or A_noise and SNR_desired, and adjusting a volume of the message to be delivered according to P_signal or A_signal. Alternatively, the actual signal-to-noise ratio SNR_actual is computed, and the message is repeated if SNR_actual falls below a minimum SNR_min. Systems for delivering a message to a recipient in an environment with ambient noise and computer-readable media having computer-executable instructions for carrying out the methods are also provided.
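
    The calculation in this abstract can be illustrated with the standard decibel relations SNR = 20 log10(A_signal / A_noise) = 10 log10(P_signal / P_noise). The abstract does not state that the SNR values are expressed in decibels, so that is an assumption here, as are the function names and the sample values.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of recorded samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def target_signal_rms(noise_samples: Sequence[float], snr_desired_db: float) -> float:
    """A_signal needed to reach SNR_desired (dB) over the recorded noise."""
    return rms(noise_samples) * 10 ** (snr_desired_db / 20)


def actual_snr_db(signal_rms: float, noise_rms: float) -> float:
    """SNR_actual in dB; both amplitudes must be positive."""
    return 20 * math.log10(signal_rms / noise_rms)


def should_repeat(signal_rms: float, noise_rms: float, snr_min_db: float) -> bool:
    """Repeat the message when SNR_actual falls below SNR_min."""
    return actual_snr_db(signal_rms, noise_rms) < snr_min_db


noise = [0.1, -0.1, 0.1, -0.1]          # hypothetical ambient-noise samples
a_signal = target_signal_rms(noise, 20)  # 20 dB means 10x the noise amplitude
repeat = should_repeat(0.2, 0.1, 10)     # about 6 dB, below a 10 dB minimum
```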

    RESULTS FROM SEARCH PROVIDERS USING A BROWSING-TIME RELEVANCY FACTOR
    8.
    Patent application
    Status: Active

    Publication No.: US20150081688A1

    Publication date: 2015-03-19

    Application No.: US14031245

    Filing date: 2013-09-19

    IPC classification: G06F17/30

    CPC classification: G06F17/30864 G06F17/30867

    Abstract: A method for searching Web pages begins with the identification of query criteria entered into a search provider. A set of Web pages that satisfies the query criteria is determined. Then, a page ranking is ascertained for each Web page in the set. The Web pages are presented in order by page ranking. The page ranking is based upon at least one relevancy factor that includes a browsing-time factor. The browsing-time factor can be calculated from browsing behavior exhibited by users who provided similar query criteria. The set of users from which the browsing-time factor is calculated can include a current user, a set of users sharing characteristics with the current user, and/or a general set of users. Browsing behavior can include time spent at a Web page, where the browsed Web page is a page that was previously presented as a search result for the similar query criteria.
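
    One way a browsing-time factor could enter a page ranking is sketched below. The abstract does not give a formula, so the normalized mean dwell time, the linear blend with a base relevance score, the `weight` parameter, and the function name `rank_pages` are all assumptions made for illustration.

```python
from typing import Dict, List


def rank_pages(
    base_scores: Dict[str, float],
    dwell_times_s: Dict[str, List[float]],
    weight: float = 0.5,
) -> List[str]:
    """Order pages by a blend of base relevance and a browsing-time factor.

    dwell_times_s holds the seconds earlier users spent on each page after
    it was presented as a result for similar query criteria.
    """
    means = {
        url: sum(times) / len(times)
        for url, times in dwell_times_s.items()
        if times
    }
    longest = max(means.values(), default=0.0)

    def score(url: str) -> float:
        # Browsing-time factor in [0, 1]; pages with no history get 0.
        factor = means.get(url, 0.0) / longest if longest > 0 else 0.0
        return (1 - weight) * base_scores[url] + weight * factor

    return sorted(base_scores, key=score, reverse=True)


ranking = rank_pages(
    base_scores={"a.example": 0.9, "b.example": 0.8, "c.example": 0.1},
    dwell_times_s={"a.example": [5.0, 5.0], "b.example": [60.0, 120.0], "c.example": []},
)
```

    In this example the page that users stayed on longest overtakes a page with a slightly higher base score.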

    Improving results from search providers using a browsing-time relevancy factor
    9.
    Granted patent
    Status: Lapsed

    Publication No.: US08635214B2

    Publication date: 2014-01-21

    Application No.: US11460038

    Filing date: 2006-07-26

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30864 G06F17/30867

    Abstract: A method for searching Web pages begins with the identification of query criteria entered into a search provider. A set of Web pages that satisfies the query criteria is determined. Then, a page ranking is ascertained for each Web page in the set. The Web pages are presented in order by page ranking. The page ranking is based upon at least one relevancy factor that includes a browsing-time factor. The browsing-time factor can be calculated from browsing behavior exhibited by users who provided similar query criteria. The set of users from which the browsing-time factor is calculated can include a current user, a set of users sharing characteristics with the current user, and/or a general set of users. Browsing behavior can include time spent at a Web page, where the browsed Web page is a page that was previously presented as a search result for the similar query criteria.

    ADJUSTING A SPEECH ENGINE FOR A MOBILE COMPUTING DEVICE BASED ON BACKGROUND NOISE
    10.
    Patent application
    Status: Active

    Publication No.: US20120123776A1

    Publication date: 2012-05-17

    Application No.: US13358097

    Filing date: 2012-01-25

    IPC classification: G10L15/20

    CPC classification: G10L21/0208 G10L15/20

    Abstract: Methods, apparatus, and products are disclosed for adjusting a speech engine for a mobile computing device based on background noise, the mobile computing device operatively coupled to a microphone, that include: sampling, through the microphone, background noise for a plurality of operating environments in which the mobile computing device operates; generating, for each operating environment, a noise model in dependence upon the sampled background noise for that operating environment; and configuring the speech engine for the mobile computing device with the noise model for the operating environment in which the mobile computing device currently operates.
