PROVIDING A CONFIDENCE MEASURE FOR SPEAKER DIARIZATION
    1.
    Patent application
    Status: In force

    Publication No.: US20140029757A1

    Publication date: 2014-01-30

    Application No.: US13557226

    Filing date: 2012-07-25

    IPC classification: H04R29/00

    Abstract: Method, system and computer product are provided for a computer implemented method for providing a confidence measure for speaker diarization. The method includes: receiving an audio session as unsegmented audio data; computing a spectral ratio of principal component analysis (PCA) of sections of the received audio session by a ratio between the largest eigenvalue and the second largest eigenvalue; using the PCA spectral ratio as a confidence measure for speaker diarization processing.

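The ratio named in the abstract can be illustrated with a minimal Python sketch. It assumes each section of the session has already been reduced to a fixed-length feature vector; the `features` array and the function name are hypothetical, and the abstract does not say how the ratio is mapped to a confidence decision. This is an illustration of the eigenvalue ratio only, not the patented implementation.

```python
import numpy as np


def pca_spectral_ratio(features: np.ndarray) -> float:
    """Ratio of the largest to the second-largest PCA eigenvalue.

    features: hypothetical (n_sections, dim) array with dim >= 2,
    one row per section of the audio session.
    """
    # PCA eigenvalues are the eigenvalues of the sample covariance matrix.
    cov = np.cov(features, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)  # returned in ascending order
    largest, second = eigvals[-1], eigvals[-2]
    # Guard against a numerically zero second eigenvalue.
    return float(largest / max(second, 1e-12))
```

Sections that vary mostly along a single direction give a large ratio, while roughly isotropic variation gives a ratio close to 1.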

    Method and system for speaker diarization

    Publication No.: US08554563B2

    Publication date: 2013-10-08

    Application No.: US13609793

    Filing date: 2012-09-11

    Applicant: Hagai Aronowitz

    Inventor: Hagai Aronowitz

    IPC classification: G10L15/00 G10L17/00

    CPC classification: G10L17/12 G06N7/005 G10L17/02

    Abstract: A method and system for speaker diarization are provided. Pre-trained acoustic models of individual speaker and/or groups of speakers are obtained. Speech data with multiple speakers is received and divided into frames. For a frame, an acoustic feature vector is determined extended to include log-likelihood ratios of the pre-trained models in relation to a background population model. The extended acoustic feature vector is used in segmentation and clustering algorithms.
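
The feature extension described in this abstract can be sketched as follows. The sketch assumes, purely for illustration, that every pre-trained model and the background model is a single diagonal-covariance Gaussian given as a `(mean, var)` pair; the abstract does not specify the model family, and all function names here are hypothetical.

```python
import numpy as np


def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of each row of x under a diagonal-covariance Gaussian."""
    diff = x - mean
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + diff * diff / var, axis=1)


def extend_with_llrs(frames, speaker_models, background_model):
    """Append one log-likelihood ratio per pre-trained model to each frame.

    frames: (n_frames, dim) acoustic feature vectors.
    speaker_models: list of (mean, var) pairs, each of shape (dim,).
    background_model: a single (mean, var) pair (assumed stand-in for the
    background population model).
    """
    bg = diag_gauss_loglik(frames, *background_model)
    llrs = [diag_gauss_loglik(frames, m, v) - bg for m, v in speaker_models]
    # Result has shape (n_frames, dim + number of pre-trained models).
    return np.hstack([frames, np.column_stack(llrs)])
```

The extended vectors would then be passed to whatever segmentation and clustering steps the system uses, which the abstract does not detail.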

    Skipping radio/television program segments
    3.
    Granted patent
    Status: In force

    Publication No.: US08473294B2

    Publication date: 2013-06-25

    Application No.: US13436067

    Filing date: 2012-03-30

    IPC classification: G10L15/00

    Abstract: Techniques for notifying at least one entity of an occurrence of an event in an audio signal are provided. At least one preference is obtained from the at least one entity. An occurrence of an event in the audio signal is determined. The event is related to at least one of at least one speaker and at least one topic. The at least one entity is notified of the occurrence of the event in the audio signal, in accordance with the at least one preference.

    Compensation of intra-speaker variability in speaker diarization
    4.
    Granted patent
    Status: In force

    Publication No.: US08433567B2

    Publication date: 2013-04-30

    Application No.: US12719024

    Filing date: 2010-04-08

    Applicant: Hagai Aronowitz

    Inventor: Hagai Aronowitz

    IPC classification: G10L15/26

    CPC classification: G10L17/02 G06N7/005

    Abstract: A method, system, and computer program product for compensation of intra-speaker variability in speaker diarization are provided. The method includes: dividing a speech session into segments of duration less than an average duration between speaker change; parameterizing each segment by a time dependent probability density function supervector, for example, using a Gaussian Mixture Model; computing a difference between successive segment supervectors; and computing a scatter measure such as a covariance matrix of the difference as an estimate of intra-speaker variability. The method further includes compensating the speech session for intra-speaker variability using the estimate of intra-speaker variability.

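The estimation step in this abstract (differences between successive segment supervectors, then a covariance matrix) can be sketched in Python. The sketch takes the supervectors as a precomputed array and does not model the GMM parameterization. The abstract does not say how the compensation is performed, so `remove_top_directions` is only an assumed example that projects out the dominant scatter directions; both function names are hypothetical.

```python
import numpy as np


def intra_speaker_scatter(supervectors: np.ndarray) -> np.ndarray:
    """Covariance of differences between successive segment supervectors.

    supervectors: (n_segments, dim) array in time order.
    """
    diffs = np.diff(supervectors, axis=0)  # successive differences
    return np.cov(diffs, rowvar=False)


def remove_top_directions(supervectors, scatter, k=1):
    """Assumed compensation: project out the k dominant scatter directions."""
    _, eigvecs = np.linalg.eigh(scatter)  # eigenvalues in ascending order
    top = eigvecs[:, -k:]                 # (dim, k) dominant directions
    return supervectors - supervectors @ top @ top.T
```

Because the segments are short relative to the typical time between speaker changes, most successive pairs come from the same speaker, which is why the difference covariance can serve as an estimate of intra-speaker variability.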

    Skipping radio/television program segments
    5.
    Granted patent
    Status: Expired

    Publication No.: US08249872B2

    Publication date: 2012-08-21

    Application No.: US12193182

    Filing date: 2008-08-18

    IPC classification: G10L15/00

    Abstract: Techniques for notifying at least one entity of an occurrence of an event in an audio signal are provided. At least one preference is obtained from the at least one entity. An occurrence of an event in the audio signal is determined. The event is related to at least one of at least one speaker and at least one topic. The at least one entity is notified of the occurrence of the event in the audio signal, in accordance with the at least one preference.

    SKIPPING RADIO/TELEVISION PROGRAM SEGMENTS
    6.
    Patent application
    Status: In force

    Publication No.: US20120191459A1

    Publication date: 2012-07-26

    Application No.: US13436067

    Filing date: 2012-03-30

    IPC classification: G10L11/00

    Abstract: Techniques for notifying at least one entity of an occurrence of an event in an audio signal are provided. At least one preference is obtained from the at least one entity. An occurrence of an event in the audio signal is determined. The event is related to at least one of at least one speaker and at least one topic. The at least one entity is notified of the occurrence of the event in the audio signal, in accordance with the at least one preference.

    COMPENSATION OF INTRA-SPEAKER VARIABILITY IN SPEAKER DIARIZATION
    7.
    Patent application
    Status: In force

    Publication No.: US20110251843A1

    Publication date: 2011-10-13

    Application No.: US12719024

    Filing date: 2010-04-08

    Applicant: Hagai Aronowitz

    Inventor: Hagai Aronowitz

    IPC classification: G10L15/26

    CPC classification: G10L17/02 G06N7/005

    Abstract: A method, system, and computer program product for compensation of intra-speaker variability in speaker diarization are provided. The method includes: dividing a speech session into segments of duration less than an average duration between speaker change; parameterizing each segment by a time dependent probability density function supervector, for example, using a Gaussian Mixture Model; computing a difference between successive segment supervectors; and computing a scatter measure such as a covariance matrix of the difference as an estimate of intra-speaker variability. The method further includes compensating the speech session for intra-speaker variability using the estimate of intra-speaker variability.

    Phoneme lattice construction and its application to speech recognition and keyword spotting
    8.
    Patent application
    Status: In force

    Publication No.: US20050010412A1

    Publication date: 2005-01-13

    Application No.: US10616310

    Filing date: 2003-07-07

    Applicant: Hagai Aronowitz

    Inventor: Hagai Aronowitz

    Abstract: An arrangement is provided for using a phoneme lattice for speech recognition and/or keyword spotting. The phoneme lattice may be constructed for an input speech signal and searched to produce a textual representation for the input speech signal and/or to determine if the input speech signal contains targeted keywords. An expectation maximization (EM) trained phoneme confusion matrix may be used when searching the phoneme lattice. The phoneme lattice may be constructed in a client and sent to a server, which may search the phoneme lattice to produce a result.

    Method and system for speaker diarization
    9.
    Granted patent
    Status: In force

    Publication No.: US08554562B2

    Publication date: 2013-10-08

    Application No.: US12618731

    Filing date: 2009-11-15

    Applicant: Hagai Aronowitz

    Inventor: Hagai Aronowitz

    IPC classification: G10L15/00 G10L17/00

    CPC classification: G10L17/12 G06N7/005 G10L17/02

    Abstract: A method and system for speaker diarization are provided. Pre-trained acoustic models of individual speaker and/or groups of speakers are obtained. Speech data with multiple speakers is received and divided into frames. For a frame, an acoustic feature vector is determined extended to include log-likelihood ratios of the pre-trained models in relation to a background population model. The extended acoustic feature vector is used in segmentation and clustering algorithms.

    Method of Trainable Speaker Diarization
    10.
    Patent application
    Status: Under examination (published)

    Publication No.: US20090319269A1

    Publication date: 2009-12-24

    Application No.: US12144659

    Filing date: 2008-06-24

    Applicant: Hagai Aronowitz

    Inventor: Hagai Aronowitz

    IPC classification: H04M3/00

    CPC classification: G10L17/00

    Abstract: A novel and useful method of using labeled training data and machine learning tools to train a speaker diarization system. Intra-speaker variability profiles are created from training data consisting of an audio stream labeled where speaker changes occur (i.e. which participant is speaking at any given time). These intra-speaker variability profiles are then applied to an unlabeled audio stream to segment the audio stream into speaker homogeneous segments and to cluster segments according to speaker identity.