Voice Activity Detection Using A Soft Decision Mechanism

    Publication No.: US20180374500A1

    Publication date: 2018-12-27

    Application No.: US15959743

    Filing date: 2018-04-23

    Inventor: Ron Wein

    CPC classification number: G10L25/78

    Abstract: Voice activity detection (VAD) is an enabling technology for a variety of speech-based applications. Herein disclosed is a robust VAD algorithm that is also language independent. Rather than classifying short segments of the audio as either “speech” or “silence”, the VAD as disclosed herein employs a soft-decision mechanism. The VAD outputs a speech-presence probability, which is based on a variety of characteristics.
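
    Illustrative sketch (not taken from the patent): the abstract does not say which characteristics feed the speech-presence probability, so the example below assumes a single feature, frame energy relative to a fixed noise floor, and maps it through a logistic function. The names frame_energies and speech_presence_probability, and the parameters noise_floor, threshold_db, and slope, are hypothetical. It only shows the difference between a soft output in (0, 1) and a hard speech/silence label.

```python
import math

def frame_energies(samples, frame_len=160):
    """Split samples into non-overlapping frames; return each frame's mean-square energy."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def speech_presence_probability(energy, noise_floor=1e-4, threshold_db=10.0, slope=0.5):
    """Soft decision: logistic of the frame's energy-to-noise ratio in dB.

    Assumption: energy is the only characteristic used here; the patent
    abstract mentions "a variety of characteristics" without listing them.
    """
    snr_db = 10.0 * math.log10((energy + 1e-12) / noise_floor)
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - threshold_db)))

# One silent frame followed by one loud frame.
signal = [0.0] * 160 + [0.5] * 160
probabilities = [speech_presence_probability(e) for e in frame_energies(signal)]
```

    A downstream component could threshold these probabilities or smooth them over time; that choice is left open here.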

    ACOUSTIC SIGNATURE BUILDING FOR A SPEAKER FROM MULTIPLE SESSIONS
    45.
    Patent application
    Legal status: In force

    Publication No.: US20160217793A1

    Publication date: 2016-07-28

    Application No.: US15006575

    Filing date: 2016-01-26

    CPC classification number: G10L17/04 G10L15/26 G10L17/02 G10L17/16 G10L25/84

    Abstract: Disclosed herein are methods of diarizing audio data using first-pass blind diarization and second-pass blind diarization that generate speaker statistical models, wherein the first-pass blind diarization is on a per-frame basis and the second-pass blind diarization is on a per-word basis, and methods of creating acoustic signatures for a common speaker based only on the statistical models of the speakers in each audio session.

    ONTOLOGY EXPANSION USING ENTITY-ASSOCIATION RULES AND ABSTRACT RELATIONS
    46.
    Patent application
    Legal status: Under examination (published)

    Publication No.: US20160217128A1

    Publication date: 2016-07-28

    Application No.: US15007703

    Filing date: 2016-01-27

    CPC classification number: G06F17/2775 G06F16/367 G06F17/277

    Abstract: A method for expanding an initial ontology via processing of communication data, wherein the initial ontology is a structural representation of language elements comprising a set of entities, a set of terms, a set of term-entity associations, a set of entity-association rules, a set of abstract relations, and a set of relation instances. A method for extracting a set of significant phrases and a set of significant phrase co-occurrences from an input set of documents further includes utilizing the terms to identify relations within the training set of communication data, wherein a relation is a pair of terms that appear in proximity to one another.
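
    Illustrative sketch (not taken from the patent): the abstract defines a relation as a pair of terms that appear in proximity to one another. The example below assumes single-word terms, whitespace tokenization, and a hypothetical max_distance window measured in tokens; the function name find_relations is also an assumption. The patent itself may define proximity differently.

```python
def find_relations(tokens, terms, max_distance=3):
    """Return ordered pairs of distinct known terms at most max_distance tokens apart."""
    # Positions of every token that is a known term, in text order.
    positions = [(i, tok) for i, tok in enumerate(tokens) if tok in terms]
    relations = set()
    for a in range(len(positions)):
        i, first = positions[a]
        for b in range(a + 1, len(positions)):
            j, second = positions[b]
            if j - i > max_distance:
                break  # later terms are even farther away
            if first != second:
                relations.add((first, second))
    return relations

tokens = "i want to cancel my account before the next billing cycle".split()
terms = {"cancel", "account", "billing"}
relations = find_relations(tokens, terms)
```

    In this input "cancel" and "account" are two tokens apart and form a relation, while "billing" is too far from either term under the assumed window.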

    Speaker separation in diarization
    47.
    Granted patent
    Legal status: In force

    Publication No.: US09368116B2

    Publication date: 2016-06-14

    Application No.: US14016783

    Filing date: 2013-09-03

    CPC classification number: G10L15/26 G10L17/06 G10L25/51 G10L25/78 G10L2025/783

    Abstract: The system and method of separating speakers in an audio file include obtaining an audio file. The audio file is transcribed into at least one text file by a transcription server. Homogenous speech segments are identified within the at least one text file. The audio file is segmented into homogenous audio segments that correspond to the identified homogenous speech segments. The homogenous audio segments of the audio file are separated into a first speaker audio file and a second speaker audio file. The first speaker audio file and the second speaker audio file are transcribed to produce a diarized transcript.
