SYSTEMS, COMPUTER-IMPLEMENTED METHODS, AND TANGIBLE COMPUTER-READABLE STORAGE MEDIA FOR TRANSCRIPTION ALIGNMENT
    1.
    Patent application
    SYSTEMS, COMPUTER-IMPLEMENTED METHODS, AND TANGIBLE COMPUTER-READABLE STORAGE MEDIA FOR TRANSCRIPTION ALIGNMENT (status: granted)

    Publication No.: US20160198234A1

    Publication date: 2016-07-07

    Application No.: US15071644

    Filing date: 2016-03-16

    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for captioning a media presentation. The method includes receiving automatic speech recognition (ASR) output from a media presentation and a transcription of the media presentation. The method includes selecting, via a processor, a pair of anchor words in the media presentation based on the ASR output and the transcription, and generating captions by aligning the transcription with the ASR output between the selected pair of anchor words. The transcription can be human-generated. Selecting pairs of anchor words can be based on a similarity threshold between the ASR output and the transcription. In one variation, commonly used words on a stop list are ineligible as anchor words. The method includes outputting the media presentation with the generated captions. The presentation can be a recording of a live event.
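
The anchor-word idea in this abstract can be illustrated with a small Python sketch. It is not the patented method: the stop list, the use of exact word matches in place of a similarity threshold, and the linear interpolation of times between anchors are all assumptions made for illustration, and the names `STOP_WORDS`, `find_anchors`, and `align` are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical stop list; the abstract only says that commonly used words
# on a stop list are ineligible as anchor words.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "is", "it"}


def find_anchors(asr, transcript):
    """Return (asr_index, transcript_index) pairs usable as anchors.

    asr is a list of (word, start_time) tuples; transcript is a list of words.
    Assumption: a word is an anchor if it matches exactly (case-insensitive)
    in both sequences and is not on the stop list.
    """
    asr_words = [w.lower() for w, _ in asr]
    ref_words = [w.lower() for w in transcript]
    matcher = SequenceMatcher(a=asr_words, b=ref_words, autojunk=False)
    anchors = []
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            if asr_words[block.a + k] not in STOP_WORDS:
                anchors.append((block.a + k, block.b + k))
    return anchors


def align(asr, transcript):
    """Attach a time to each transcript word.

    Anchor words take the ASR time directly. Words between a pair of anchors
    are interpolated linearly (an assumption). Words before the first anchor
    or after the last one keep a time of None.
    """
    anchors = find_anchors(asr, transcript)
    times = [None] * len(transcript)
    for asr_i, ref_i in anchors:
        times[ref_i] = asr[asr_i][1]
    for (a0, r0), (a1, r1) in zip(anchors, anchors[1:]):
        start, end = asr[a0][1], asr[a1][1]
        for j in range(r0 + 1, r1):
            times[j] = start + (end - start) * (j - r0) / (r1 - r0)
    return list(zip(transcript, times))


asr = [("welcome", 0.0), ("to", 0.4), ("the", 0.6),
       ("conference", 1.0), ("on", 1.6), ("speech", 2.0)]
transcript = ["Welcome", "to", "the", "annual", "conference", "on", "speech"]
captions = align(asr, transcript)
```

Here the ASR output missed the word "annual"; the sketch still assigns it a time because it falls between the anchors "welcome" and "conference".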

    System and Method for Generalized Preselection for Unit Selection Synthesis
    2.
    Patent application
    System and Method for Generalized Preselection for Unit Selection Synthesis (status: granted)

    Publication No.: US20140350940A1

    Publication date: 2014-11-27

    Application No.: US14454123

    Filing date: 2014-08-07

    IPC classification: G10L13/06 G10L13/047

    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for unit selection synthesis. The method causes a computing device to add a supplemental phoneset to a speech synthesizer front end having an existing phoneset, modify a unit preselection process based on the supplemental phoneset, preselect units from the supplemental phoneset and the existing phoneset based on the modified unit preselection process, and generate speech based on the preselected units. The supplemental phoneset can be a variation of the existing phoneset, can include a word boundary feature, can include a cluster feature where initial consonant clusters and some word boundaries are marked with diacritics, can include a function word feature which marks units as originating from a function word or a content word, and/or can include a pre-vocalic or post-vocalic feature. The speech synthesizer front end can incorporate the supplemental phoneset as an extra feature.
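
As a rough illustration of preselection with a supplemental phoneset, the Python sketch below ranks candidate units so that those matching a supplemental label come first and units matching only the existing phone label serve as a fallback. The abstract does not say how the preselection process is modified, so this ranking rule, the `Unit` record, the `preselect` function, and the `k#` notation for a word-initial phone are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    unit_id: int
    phone: str         # label from the existing phoneset
    supplemental: str  # label from the supplemental phoneset (hypothetical notation)


def preselect(database, phone, supplemental, limit=3):
    """Return up to `limit` candidate units for one target phone.

    Assumed rule: units whose supplemental label matches are preferred;
    units that match only the base phone are appended as a fallback.
    """
    exact = [u for u in database if u.supplemental == supplemental]
    fallback = [u for u in database
                if u.phone == phone and u.supplemental != supplemental]
    return (exact + fallback)[:limit]


database = [
    Unit(1, "k", "k"),    # word-internal k
    Unit(2, "k", "k#"),   # word-initial k (hypothetical marking)
    Unit(3, "t", "t#"),
    Unit(4, "k", "k#"),
]
candidates = preselect(database, phone="k", supplemental="k#")
```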

    AUTOMATED DETECTION AND FILTERING OF AUDIO ADVERTISEMENTS
    3.
    Patent application
    AUTOMATED DETECTION AND FILTERING OF AUDIO ADVERTISEMENTS (status: granted)

    Publication No.: US20160085858A1

    Publication date: 2016-03-24

    Application No.: US14865979

    Filing date: 2015-09-25

    IPC classification: G06F17/30 G10L15/08

    Abstract: Apparatuses, systems, methods, and media for filtering a data stream are provided. The data stream is analyzed based on an acoustic parameter to determine extraneous portions in which a first predetermined condition is satisfied. When a first extraneous portion is separated from a second extraneous portion by a non-extraneous portion in which the first predetermined condition is not satisfied, it is determined whether that separation satisfies a second predetermined condition. In response to determining that the second predetermined condition is satisfied, at least one of the first extraneous portion and the second extraneous portion is deleted from the data stream to produce a filtered data stream.
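
The two-condition filtering described here can be sketched in Python as follows. The abstract does not name the acoustic parameter or either condition, so this sketch assumes a per-frame numeric value, a first condition of "value above a threshold", and a second condition of "the separating gap is at most `max_gap` frames". The function names are hypothetical.

```python
def extraneous_runs(values, threshold):
    """Return [start, end) index ranges where the assumed first condition
    (value > threshold) holds for consecutive frames."""
    runs, start = [], None
    for i, v in enumerate(values):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(values)))
    return runs


def filter_stream(values, threshold, max_gap):
    """Delete pairs of extraneous runs separated by a short gap.

    Assumed second condition: the non-extraneous gap between two
    neighbouring runs is at most max_gap frames. Both runs are then
    dropped, while the gap itself is kept.
    """
    runs = extraneous_runs(values, threshold)
    drop = set()
    for (s1, e1), (s2, e2) in zip(runs, runs[1:]):
        if s2 - e1 <= max_gap:
            drop.update(range(s1, e1))
            drop.update(range(s2, e2))
    return [v for i, v in enumerate(values) if i not in drop]


frames = [1, 9, 9, 1, 9, 1, 1, 1, 1, 9, 1]
filtered = filter_stream(frames, threshold=5, max_gap=2)
```

In this example the first two loud runs are one frame apart and are removed, while the isolated run near the end is kept because its gap exceeds `max_gap`.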

    Systems, Computer-Implemented Methods, and Tangible Computer-Readable Storage Media For Transcription Alignment
    4.
    Patent application
    Systems, Computer-Implemented Methods, and Tangible Computer-Readable Storage Media For Transcription Alignment (status: granted)

    Publication No.: US20150046160A1

    Publication date: 2015-02-12

    Application No.: US14492616

    Filing date: 2014-09-22

    IPC classification: G10L15/26 G10L21/06

    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for captioning a media presentation. The method includes receiving automatic speech recognition (ASR) output from a media presentation and a transcription of the media presentation. The method includes selecting, via a processor, a pair of anchor words in the media presentation based on the ASR output and the transcription, and generating captions by aligning the transcription with the ASR output between the selected pair of anchor words. The transcription can be human-generated. Selecting pairs of anchor words can be based on a similarity threshold between the ASR output and the transcription. In one variation, commonly used words on a stop list are ineligible as anchor words. The method includes outputting the media presentation with the generated captions. The presentation can be a recording of a live event.

    SYSTEM AND METHOD FOR DATA-DRIVEN INTONATION GENERATION
    6.
    Patent application
    SYSTEM AND METHOD FOR DATA-DRIVEN INTONATION GENERATION (status: under examination, published)

    Publication No.: US20150149178A1

    Publication date: 2015-05-28

    Application No.: US14087840

    Filing date: 2013-11-22

    IPC classification: G10L13/02

    CPC classification: G10L13/10

    Abstract: Systems, methods, and computer-readable storage media for text-to-speech processing with improved intonation. The system first receives text to be converted to speech, the text having a first segment and a second segment. The system then compares the text to a database of stored utterances, identifying in the database a first utterance corresponding to the first segment and determining an intonation of the first utterance. When the database does not contain a second utterance corresponding to the second segment, the system generates the speech corresponding to the text by combining the first utterance with a generated second utterance corresponding to the second segment, the generated second utterance having an intonation matching, or based on, that of the first utterance. These actions lead to an improved, smoother, more human-like synthetic speech output from the system.
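
The lookup-then-generate flow in this abstract can be pictured with the minimal Python sketch below. A single `pitch_hz` number stands in for a full intonation contour, and the `Utterance` record, the dictionary-based database, and the `synthesize` function are hypothetical; the abstract does not specify how intonation is represented or generated.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    text: str
    pitch_hz: float          # simplified stand-in for an intonation contour
    generated: bool = False  # True when synthesized rather than stored


def synthesize(first, second, database):
    """Combine a stored first utterance with a stored or generated second one.

    The first segment is assumed to be in the database, as in the abstract.
    If the second segment is missing, a new utterance is generated that
    borrows the intonation of the first utterance.
    """
    first_utt = database[first]
    second_utt = database.get(second)
    if second_utt is None:
        second_utt = Utterance(second, first_utt.pitch_hz, generated=True)
    return [first_utt, second_utt]


database = {"your balance is": Utterance("your balance is", 180.0)}
speech = synthesize("your balance is", "forty two dollars", database)
```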

    SYSTEMS, COMPUTER-IMPLEMENTED METHODS, AND TANGIBLE COMPUTER-READABLE STORAGE MEDIA FOR TRANSCRIPTION ALIGNMENT
    7.
    Patent application
    SYSTEMS, COMPUTER-IMPLEMENTED METHODS, AND TANGIBLE COMPUTER-READABLE STORAGE MEDIA FOR TRANSCRIPTION ALIGNMENT (status: granted)

    Publication No.: US20170061986A1

    Publication date: 2017-03-02

    Application No.: US15350339

    Filing date: 2016-11-14

    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for captioning a media presentation. The method includes receiving automatic speech recognition (ASR) output from a media presentation and a transcription of the media presentation. The method includes selecting, via a processor, a pair of anchor words in the media presentation based on the ASR output and the transcription, and generating captions by aligning the transcription with the ASR output between the selected pair of anchor words. The transcription can be human-generated. Selecting pairs of anchor words can be based on a similarity threshold between the ASR output and the transcription. In one variation, commonly used words on a stop list are ineligible as anchor words. The method includes outputting the media presentation with the generated captions. The presentation can be a recording of a live event.

    System and Method for Cloud-Based Text-to-Speech Web Services
    8.
    Patent application
    System and Method for Cloud-Based Text-to-Speech Web Services (status: granted)

    Publication No.: US20150221298A1

    Publication date: 2015-08-06

    Application No.: US14684893

    Filing date: 2015-04-13

    IPC classification: G10L13/04

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating speech. One variation of the method is performed on the server side, and another on the client side. The server-side method, as implemented by a network-based automatic speech processing system, includes first receiving, from a network client independent of knowledge of internal operations of the system, a request to generate a text-to-speech voice. The request can include speech samples, transcriptions of the speech samples, and metadata describing the speech samples. The system extracts sound units from the speech samples based on the transcriptions and generates an interactive demonstration of the text-to-speech voice based on the sound units, the transcriptions, and the metadata, wherein the interactive demonstration hides the back-end processing implementation from the network client. The system provides the network client with access to the interactive demonstration.

    AUTOMATIC DISCLOSURE DETECTION
    9.
    Patent application
    AUTOMATIC DISCLOSURE DETECTION (status: granted)

    Publication No.: US20130166293A1

    Publication date: 2013-06-27

    Application No.: US13772509

    Filing date: 2013-02-21

    IPC classification: G10L15/04

    Abstract: A method of detecting pre-determined phrases to determine compliance quality is provided. The method includes determining whether at least one of an event or a precursor event has occurred based on a comparison between pre-determined phrases and a communication between a sender and a recipient in a communications network, and rating the recipient based on the presence of the pre-determined phrases associated with the event or the presence of the pre-determined phrases associated with the precursor event in the communication.
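
A phrase-based compliance check of this kind might look like the Python sketch below. The abstract does not define "event" or "precursor event", so the sketch assumes that an event is a required disclosure being spoken and a precursor event is a situation that triggers the requirement. The example phrases, the rating labels, and the `rate_recipient` function are hypothetical.

```python
def rate_recipient(text, event_phrases, precursor_phrases):
    """Rate one communication by simple case-insensitive phrase matching.

    Assumed rating rule: a detected event phrase counts as compliant;
    a precursor phrase with no event phrase counts as non-compliant;
    otherwise the check does not apply.
    """
    lowered = text.lower()
    event = any(p.lower() in lowered for p in event_phrases)
    precursor = any(p.lower() in lowered for p in precursor_phrases)
    if event:
        rating = "compliant"
    elif precursor:
        rating = "non-compliant"
    else:
        rating = "not applicable"
    return {"event": event, "precursor": precursor, "rating": rating}


EVENT_PHRASES = ["this call may be recorded"]   # hypothetical disclosure
PRECURSOR_PHRASES = ["open an account"]         # hypothetical trigger

result = rate_recipient("I want to open an account today.",
                        EVENT_PHRASES, PRECURSOR_PHRASES)
```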
