System and method for low-latency web-based text-to-speech without plugins
    2.
    Granted patent (status: in force)

    Publication No.: US09240180B2

    Publication date: 2016-01-19

    Application No.: US13308860

    Filing date: 2011-12-01

    IPC classification: G10L13/00 G10L13/08 G10L13/10

    CPC classification: G10L13/04 G10L13/10

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for reducing latency in web-based text-to-speech (TTS) systems without the use of a plug-in or Flash® module. A system configured according to the disclosed methods allows the browser to send prosodically meaningful sections of text to a web server. A TTS server then converts intonational phrases of the text into audio and responds to the browser with the audio file. The system saves the audio file in a cache, with the file indexed by a unique identifier. As the system continues converting text into speech, when identical text appears, the system uses the cached audio corresponding to the identical text without the need for re-synthesis via the TTS server.
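
    The caching behaviour described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the names `split_into_phrases`, `PhraseAudioCache`, and `fake_tts` are hypothetical, splitting on punctuation is only a crude stand-in for prosodic segmentation, and deriving the unique identifier from a SHA-256 hash of the phrase text is an assumption (the abstract does not say how the identifier is formed).

```python
import hashlib
import re
from typing import Callable, Dict, List


def split_into_phrases(text: str) -> List[str]:
    """Crude stand-in for prosodic segmentation: split after . , ; : ! ?"""
    parts = re.split(r"(?<=[.,;:!?])\s+", text.strip())
    return [p for p in parts if p]


class PhraseAudioCache:
    """Caches synthesized audio, indexed by an identifier derived from the text."""

    def __init__(self, synthesize: Callable[[str], bytes]) -> None:
        self._synthesize = synthesize  # hypothetical call to a TTS server
        self._audio: Dict[str, bytes] = {}
        self.synth_calls = 0  # counts cache misses, for illustration

    @staticmethod
    def phrase_id(phrase: str) -> str:
        # Assumption: identifier = SHA-256 of the lowercased,
        # whitespace-normalized phrase text.
        normalized = " ".join(phrase.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def audio_for(self, phrase: str) -> bytes:
        key = self.phrase_id(phrase)
        if key not in self._audio:
            # Cache miss: synthesize once and store under the identifier.
            self.synth_calls += 1
            self._audio[key] = self._synthesize(phrase)
        return self._audio[key]


def fake_tts(phrase: str) -> bytes:
    """Placeholder synthesizer so the sketch runs without a real TTS server."""
    return b"AUDIO:" + phrase.encode("utf-8")


cache = PhraseAudioCache(fake_tts)
text = "Hello there, welcome back. Hello there, goodbye."
clips = [cache.audio_for(p) for p in split_into_phrases(text)]
```

    In this example the phrase "Hello there," occurs twice, so the second occurrence is served from the cache and the placeholder synthesizer is called only for the three distinct phrases.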

    USER PROFILE AND ITS LOCATION IN A CLUSTERED PROFILE LANDSCAPE
    4.
    Patent application (status: in force)

    Publication No.: US20120089605A1

    Publication date: 2012-04-12

    Application No.: US12901075

    Filing date: 2010-10-08

    IPC classification: G06F17/30

    Abstract: Delivering targeted content includes collecting, via at least one tangible processor, user activity data for users during a specified time period. Questions asked by the users during the specified time period are extracted from the user activity data, via the at least one tangible processor, and stored in user profiles for the users. The user profiles are clustered, via the at least one tangible processor, based on the questions asked. Targeted content is delivered, via the at least one tangible processor, to a subset of the users based on the clustering.
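
    As a rough illustration of clustering profiles by the questions asked, the sketch below groups users whose questions share keywords. Everything in it is an assumption made for the example: the abstract does not name a clustering method, so a greedy single-pass grouping on Jaccard keyword overlap is swapped in, and `STOPWORDS`, `keywords`, `cluster_profiles`, `deliver`, and the 0.3 threshold are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"how", "do", "i", "my", "the", "a", "to", "what", "is", "can", "for"}


def keywords(questions: List[str]) -> Set[str]:
    """Reduce a user's stored questions to a set of content words."""
    words: Set[str] = set()
    for question in questions:
        for word in question.lower().replace("?", " ").split():
            if word not in STOPWORDS:
                words.add(word)
    return words


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_profiles(profiles: Dict[str, List[str]], threshold: float = 0.3) -> List[List[str]]:
    """Greedy single pass: join the first cluster whose seed profile is similar enough."""
    clusters: List[List[str]] = []
    seeds: List[Set[str]] = []
    for user, questions in profiles.items():
        kw = keywords(questions)
        for i, seed in enumerate(seeds):
            if jaccard(kw, seed) >= threshold:
                clusters[i].append(user)
                break
        else:
            # No similar cluster found: this profile seeds a new one.
            clusters.append([user])
            seeds.append(kw)
    return clusters


def deliver(content: str, cluster: List[str]) -> Dict[str, str]:
    """Map targeted content to each user in one cluster (the 'subset of the users')."""
    return {user: content for user in cluster}


profiles = {
    "alice": ["How do I reset my router?", "What is the router password?"],
    "bob": ["How do I reset the router password?"],
    "carol": ["Can I upgrade my data plan?"],
}
clusters = cluster_profiles(profiles)
outbox = deliver("Router troubleshooting guide", clusters[0])
```

    With these sample profiles, the two users who asked router questions land in one cluster and receive the router guide, while the third user forms a separate cluster.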

    System and method for dynamic facial features for speaker recognition
    5.
    Granted patent (status: in force)

    Publication No.: US08897500B2

    Publication date: 2014-11-25

    Application No.: US13101704

    Filing date: 2011-05-05

    IPC classification: G06K9/00 G10L17/24 G06F21/32

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing speaker verification. A system configured to practice the method receives a request to verify a speaker, generates a text challenge that is unique to the request, and, in response to the request, prompts the speaker to utter the text challenge. Then the system records a dynamic image feature of the speaker as the speaker utters the text challenge, and performs speaker verification based on the dynamic image feature and the text challenge. Recording the dynamic image feature of the speaker can include recording video of the speaker while speaking the text challenge. The dynamic image feature can include a movement pattern of the head, lips, mouth, eyes, and/or eyebrows of the speaker. The dynamic image feature can relate to phonetic content of the speaker speaking the challenge, speech prosody, and the speaker's facial expression responding to content of the challenge.
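
    The challenge-and-verify flow in this abstract might be outlined as below. This is only a sketch under assumptions: the word list, `generate_challenge`, `Observation`, and `verify` are hypothetical, the transcript is assumed to come from some speech recognizer, and `motion_similarity` stands in for a score (0 to 1) comparing the recorded lip and head movement with an enrolled pattern. The abstract does not specify how either signal is computed or combined.

```python
import secrets
from dataclasses import dataclass
from typing import Set

# Hypothetical vocabulary for building challenges.
WORDS = ["amber", "bridge", "candle", "delta", "ember", "falcon", "garden", "harbor"]
_issued: Set[str] = set()  # challenges already handed out in this process


def generate_challenge(n_words: int = 4) -> str:
    """Return a random word sequence that has not been issued before."""
    while True:
        challenge = " ".join(secrets.choice(WORDS) for _ in range(n_words))
        if challenge not in _issued:
            _issued.add(challenge)
            return challenge


@dataclass
class Observation:
    transcript: str           # what the speaker was heard to say (assumed input)
    motion_similarity: float  # assumed score for the dynamic image feature, 0..1


def verify(challenge: str, obs: Observation, min_similarity: float = 0.8) -> bool:
    """Accept only if the challenge text was spoken AND the facial motion matches."""
    said_challenge = obs.transcript.strip().lower() == challenge.lower()
    return said_challenge and obs.motion_similarity >= min_similarity


challenge = generate_challenge()
accepted = verify(challenge, Observation(challenge, 0.91))
replayed = verify(challenge, Observation("some older phrase", 0.91))
```

    Because each request gets a fresh challenge, a recording of an earlier session (as in `replayed`) does not contain the expected text and is rejected even when the facial-motion score is high.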

    SYSTEM AND METHOD FOR DYNAMIC FACIAL FEATURES FOR SPEAKER RECOGNITION
    6.
    Patent application (status: in force)

    Publication No.: US20120281885A1

    Publication date: 2012-11-08

    Application No.: US13101704

    Filing date: 2011-05-05

    IPC classification: G06K9/00

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing speaker verification. A system configured to practice the method receives a request to verify a speaker, generates a text challenge that is unique to the request, and, in response to the request, prompts the speaker to utter the text challenge. Then the system records a dynamic image feature of the speaker as the speaker utters the text challenge, and performs speaker verification based on the dynamic image feature and the text challenge. Recording the dynamic image feature of the speaker can include recording video of the speaker while speaking the text challenge. The dynamic image feature can include a movement pattern of the head, lips, mouth, eyes, and/or eyebrows of the speaker. The dynamic image feature can relate to phonetic content of the speaker speaking the challenge, speech prosody, and the speaker's facial expression responding to content of the challenge.

    PERSONAL CUSTOMER CARE AGENT
    7.
    Patent application (status: in force)

    Publication No.: US20120095861A1

    Publication date: 2012-04-19

    Application No.: US12905172

    Filing date: 2010-10-15

    Abstract: Aggregating information includes configuring, by at least one processor, a user profile that indicates user preferences for aggregated information. The at least one processor monitors information sources including the World Wide Web, business websites of interest, and online social media, based on the user preferences. Data obtained from the information sources is presented, based on the monitoring, by the at least one processor, in accordance with a presentation format, as the aggregated information, based on the user preferences. The at least one processor triggers updating of the presented aggregated information based on a change to the data from at least one of the information sources and a change to the user profile.
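
    The update trigger described here (refresh the presented view when source data or the user profile changes) could be sketched as follows. The `Aggregator` class, its `poll` method, the `topics` profile key, and the use of a SHA-256 fingerprint for change detection are all assumptions for illustration; the abstract does not describe how changes are detected.

```python
import hashlib
import json
from typing import Callable, Dict, List, Optional


class Aggregator:
    """Rebuilds an aggregated view when source data or the user profile changes."""

    def __init__(self, sources: Dict[str, Callable[[], List[str]]]) -> None:
        self.sources = sources  # source name -> hypothetical fetch function
        self._fingerprint: Optional[str] = None
        self.view: List[str] = []
        self.refresh_count = 0

    def poll(self, profile: Dict[str, List[str]]) -> bool:
        """Fetch all sources; rebuild the view only if something changed."""
        data = {name: fetch() for name, fetch in self.sources.items()}
        blob = json.dumps({"data": data, "profile": profile}, sort_keys=True)
        fingerprint = hashlib.sha256(blob.encode("utf-8")).hexdigest()
        if fingerprint == self._fingerprint:
            return False  # neither the data nor the profile changed
        self._fingerprint = fingerprint
        topics = [t.lower() for t in profile.get("topics", [])]
        # Keep only items that mention one of the user's preferred topics.
        self.view = [
            f"{name}: {item}"
            for name, items in sorted(data.items())
            for item in items
            if any(t in item.lower() for t in topics)
        ]
        self.refresh_count += 1
        return True


feed = {"news": ["Router recall announced", "Stock update"]}
agent = Aggregator({"news": lambda: list(feed["news"])})
profile = {"topics": ["router"]}
first = agent.poll(profile)    # initial build
second = agent.poll(profile)   # nothing changed
feed["news"].append("New router firmware")
third = agent.poll(profile)    # source data changed
profile["topics"].append("stock")
fourth = agent.poll(profile)   # profile changed
```

    Polling is used here for simplicity; a push-based design with change notifications from each source would satisfy the same description.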

    SYSTEM AND METHOD FOR TIGHTLY COUPLING AUTOMATIC SPEECH RECOGNITION AND SEARCH
    8.
    Patent application (status: in force)

    Publication No.: US20110144995A1

    Publication date: 2011-06-16

    Application No.: US12638649

    Filing date: 2009-12-15

    IPC classification: G10L15/00 G06F17/30

    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for performing a search. A system configured to practice the method first receives from an automatic speech recognition (ASR) system a word lattice based on a speech query and receives indexed documents from an information repository. The system composes, based on the word lattice and the indexed documents, at least one triple including a query word, a selected indexed document, and a weight. The system generates an N-best path through the word lattice based on the at least one triple and re-ranks ASR output based on the N-best path. The system aggregates each weight across the query words to generate N-best listings and returns search results to the speech query based on the re-ranked ASR output and the N-best listings. The lattice can be a confusion network, the arc density of which can be adjusted for a desired performance level.
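
    The triple composition and weight aggregation steps can be illustrated with a small sketch. It assumes the lattice is a confusion network given as one word-to-posterior dictionary per slot, assumes an inverted index mapping each word to per-document term counts, and assumes a weight of posterior times term count; none of these specifics come from the abstract. The names `compose_triples` and `rank_documents` are hypothetical, and the N-best path search and ASR re-ranking steps are not shown.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

ConfusionNetwork = List[Dict[str, float]]  # one {word: posterior} dict per slot
Index = Dict[str, Dict[str, int]]          # word -> {document id: term count}
Triple = Tuple[str, str, float]            # (query word, document id, weight)


def compose_triples(network: ConfusionNetwork, index: Index) -> List[Triple]:
    """Pair every candidate word in the network with each document that contains it."""
    triples: List[Triple] = []
    for slot in network:
        for word, posterior in slot.items():
            for doc_id, term_count in index.get(word, {}).items():
                # Assumed weighting: ASR posterior scaled by term count.
                triples.append((word, doc_id, posterior * term_count))
    return triples


def rank_documents(triples: List[Triple]) -> List[Tuple[str, float]]:
    """Aggregate weights across query words and sort documents by total weight."""
    totals: Dict[str, float] = defaultdict(float)
    for _word, doc_id, weight in triples:
        totals[doc_id] += weight
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Two-slot confusion network for an utterance that may be "pizza hut".
network = [
    {"pizza": 0.7, "visa": 0.3},
    {"hut": 0.6, "hot": 0.4},
]
index = {
    "pizza": {"doc_pizzeria": 2},
    "hut": {"doc_pizzeria": 1},
    "visa": {"doc_bank": 1},
}
triples = compose_triples(network, index)
ranking = rank_documents(triples)
```

    Here the document supported by both "pizza" and "hut" accumulates the largest total weight, which shows how search evidence can favor one recognition hypothesis over an acoustically similar alternative.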

    System and method for generating challenge utterances for speaker verification
    9.
    Granted patent (status: in force)

    Publication No.: US09318114B2

    Publication date: 2016-04-19

    Application No.: US12954094

    Filing date: 2010-11-24

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media relating to speaker verification. In one aspect, a system receives a first user identity from a second user, and, based on the identity, accesses voice characteristics. The system randomly generates a challenge sentence according to a rule and/or grammar, based on the voice characteristics, and prompts the second user to speak the challenge sentence. The system verifies that the second user is the first user if the spoken challenge sentence matches the voice characteristics. In an enrollment aspect, the system constructs an enrollment phrase that covers a minimum threshold of unique speech sounds based on speaker-distinctive phonemes, phoneme clusters, and prosody. The user then utters the enrollment phrase, and the system extracts voice characteristics for the user from the uttered enrollment phrase. The system generates a user profile, based on the voice characteristics, for generating random challenge sentences according to a grammar.
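
    Random generation of a challenge sentence from a grammar might look like the sketch below. The toy grammar, `expand`, and `generate_challenge` are hypothetical, and the sketch deliberately leaves out the part of the abstract where generation is conditioned on the user's voice characteristics; it only shows random expansion of grammar rules.

```python
import random
from typing import Dict, List, Optional

# Toy context-free grammar: each non-terminal maps to a list of productions.
GRAMMAR: Dict[str, List[List[str]]] = {
    "S": [["NP", "VP"]],
    "NP": [["the", "ADJ", "N"]],
    "VP": [["V", "NP"]],
    "ADJ": [["purple"], ["quiet"], ["silver"]],
    "N": [["river"], ["violin"], ["lantern"]],
    "V": [["follows"], ["paints"], ["carries"]],
}


def expand(symbol: str, rng: random.Random) -> List[str]:
    """Recursively expand a symbol; anything not in the grammar is a terminal word."""
    if symbol not in GRAMMAR:
        return [symbol]
    production = rng.choice(GRAMMAR[symbol])
    words: List[str] = []
    for part in production:
        words.extend(expand(part, rng))
    return words


def generate_challenge(rng: Optional[random.Random] = None) -> str:
    """Generate one challenge sentence; defaults to an OS-backed random source."""
    rng = rng or random.SystemRandom()
    return " ".join(expand("S", rng))


# A seeded generator is used here only to make the example repeatable.
sentence = generate_challenge(random.Random(0))
```

    Every sentence from this grammar has the shape "the ADJ N V the ADJ N", giving 3^5 = 243 possible challenges; a practical grammar would need to be much larger so that a challenge is unlikely to repeat.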

    Personal customer care agent
    10.
    Granted patent (status: in force)

    Publication No.: US09076146B2

    Publication date: 2015-07-07

    Application No.: US12905172

    Filing date: 2010-10-15

    Abstract: Aggregating information includes configuring, by at least one processor, a user profile that indicates user preferences for aggregated information. The at least one processor monitors information sources including the World Wide Web, business websites of interest, and online social media, based on the user preferences. Data obtained from the information sources is presented, based on the monitoring, by the at least one processor, in accordance with a presentation format, as the aggregated information, based on the user preferences. The at least one processor triggers updating of the presented aggregated information based on a change to the data from at least one of the information sources and a change to the user profile.
