Multiple microphone voice activity detector
    1.
    Granted invention patent
    Status: In force

    Publication No.: US08954324B2

    Publication date: 2015-02-10

    Application No.: US11864897

    Filing date: 2007-09-28

    CPC classification: G10L25/78 G10L2021/02165

    Abstract: Voice activity detection using multiple microphones can be based on a relationship between an energy at each of a speech reference microphone and a noise reference microphone. The energy output from each of the speech reference microphone and the noise reference microphone can be determined. A speech to noise energy ratio can be determined and compared to a predetermined voice activity threshold. In another embodiment, the absolute values of the autocorrelations of the speech and noise reference signals are determined and a ratio based on the autocorrelation values is determined. Ratios that exceed the predetermined threshold can indicate the presence of a voice signal. The speech and noise energies or autocorrelations can be determined using a weighted average or over a discrete frame size.
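
    The energy-ratio test described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the default threshold of 2.0, the smoothing factor `alpha`, and the small `eps` guard against division by zero are all assumptions introduced here.

```python
def frame_energy(samples):
    """Energy of one discrete frame: the sum of squared samples."""
    return sum(s * s for s in samples)


def is_voice(speech_frame, noise_frame, threshold=2.0, eps=1e-12):
    """Return True when the speech-to-noise energy ratio exceeds the threshold.

    threshold and eps are illustrative assumptions, not values from the patent.
    """
    ratio = frame_energy(speech_frame) / (frame_energy(noise_frame) + eps)
    return ratio > threshold


def smoothed_energy(previous, samples, alpha=0.9):
    """Weighted-average alternative: blend the previous estimate with the
    energy of the current frame (alpha is an assumed smoothing factor)."""
    return alpha * previous + (1.0 - alpha) * frame_energy(samples)
```

    A frame pair whose speech-reference energy is well above the noise-reference energy is flagged as voice; roughly equal energies are not.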

    Recognition processing of a plurality of streaming voice signals for determination of a responsive action thereto
    2.
    Granted invention patent
    Status: In force

    Publication No.: US08751222B2

    Publication date: 2014-06-10

    Application No.: US12470614

    Filing date: 2009-05-22

    IPC classification: G10L11/06

    Abstract: Streaming voice signals, such as might be received at a contact center or similar operation, are analyzed to detect the occurrence of one or more unprompted, predetermined utterances. The predetermined utterances preferably constitute a vocabulary of words and/or phrases having particular meaning within the context in which they are uttered. Detection of one or more of the predetermined utterances during a call causes a determination of response-determinative significance of the detected utterance(s). Based on the response-determinative significance of the detected utterance(s), a responsive action may be further determined. Additionally, long term storage of the call corresponding to the detected utterance may also be initiated. Conversely, calls in which no predetermined utterances are detected may be deleted from short term storage. In this manner, the present invention simplifies the storage requirements for contact centers and provides the opportunity to improve caller experiences by providing shorter reaction times to potentially problematic situations.

    System, method and program for voice detection
    3.
    Granted invention patent
    Status: In force

    Publication No.: US08694308B2

    Publication date: 2014-04-08

    Application No.: US12744671

    Filing date: 2008-11-26

    IPC classification: G10L11/06

    CPC classification: G10L25/93

    Abstract: A system for voice detection includes a feature value calculation unit that calculates a feature value from an input signal sliced on a per frame basis, a provisional voice/non-voice decision unit that provisionally decides a voiced interval and a non-voiced interval from the feature value calculated on a per frame basis, and a voice/non-voice decision unit that determines a voiced interval duration threshold value or a non-voiced interval duration threshold value, using a ratio of the feature value found on a per frame basis to a threshold value for the feature value, and that re-decides the voiced interval and the non-voiced interval, using the voiced interval duration threshold value determined and the non-voiced interval duration threshold value determined. By determining the voiced interval duration threshold value and the non-voiced interval duration threshold value, using the feature value found on a per frame basis and the threshold value for the feature value, the constraint of the shaping rule may be made weaker or stronger according to whether or not the feature value found on a per frame basis can be regarded as reliable, thereby allowing voice detection to be made without dependency upon a noise environment.
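
    One way to picture the two-stage decision in this abstract is the sketch below. The abstract does not give the actual rule that maps the feature-to-threshold ratio onto a duration threshold, so `duration_threshold` is a hypothetical stand-in: it assumes positive feature values and treats a run whose mean feature lies far from the feature threshold as reliable, which shortens the minimum duration it must reach. `base_frames` is likewise an assumed parameter.

```python
from itertools import groupby


def provisional_labels(features, feature_threshold):
    """Provisional per-frame decision: True means voiced."""
    return [f > feature_threshold for f in features]


def duration_threshold(run_features, feature_threshold, base_frames=4):
    """Hypothetical mapping from the feature/threshold ratio to a minimum
    run length in frames. A ratio far from 1.0 (in either direction) is
    treated as reliable, so the shaping constraint is weakened."""
    mean_ratio = (sum(run_features) / len(run_features)) / feature_threshold
    if mean_ratio > 0:
        reliability = max(mean_ratio, 1.0 / mean_ratio)
    else:
        reliability = float("inf")
    return max(1, round(base_frames / reliability))


def redecide(features, feature_threshold, base_frames=4):
    """Re-decide each run of equal provisional labels: runs shorter than
    their duration threshold are flipped to the opposite label."""
    labels = provisional_labels(features, feature_threshold)
    result = []
    start = 0
    for label, group in groupby(labels):
        length = len(list(group))
        run = features[start:start + length]
        min_len = duration_threshold(run, feature_threshold, base_frames)
        result.extend([label if length >= min_len else not label] * length)
        start += length
    return result
```

    With this assumed rule, a single frame that barely exceeds the feature threshold is absorbed into the surrounding non-voiced interval, while a single frame that exceeds it by a wide margin survives.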

    Device, method and program for voice detection and recording medium
    4.
    Granted invention patent
    Status: In force

    Publication No.: US08589152B2

    Publication date: 2013-11-19

    Application No.: US12993134

    Filing date: 2009-05-26

    IPC classification: G10L21/02 G10L11/06

    CPC classification: G10L25/78 G10L25/18

    Abstract: A voice detection device includes a band-based power calculation unit that calculates a total of signal power values (sub-band power) of signals entered from the microphones from one preset frequency width (sub-band) to another. The voice detection device also includes a band-based noise estimation unit that estimates the sub-band based noise power, and a sub-band based SNR calculation unit. The sub-band based SNR calculation unit calculates a sub-band SNR from one sub-band to another to output the largest one of the sub-band SNRs as an SNR for a microphone of interest. The voice detection device further includes a voice/non-voice decision unit that determines the voice/non-voice using the SNR for the microphone of interest.
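
    The per-microphone SNR described here (sub-band power, sub-band noise power, then the largest sub-band SNR) might look roughly like the sketch below. The noise estimate is simply passed in, since the abstract does not say how it is obtained; the function names, `snr_threshold`, and `eps` are assumptions for illustration.

```python
def subband_powers(power_spectrum, band_width):
    """Total signal power in each sub-band of band_width bins."""
    return [
        sum(power_spectrum[i:i + band_width])
        for i in range(0, len(power_spectrum), band_width)
    ]


def microphone_snr(power_spectrum, noise_subband_power, band_width, eps=1e-12):
    """Largest sub-band SNR, used as the SNR of the microphone of interest."""
    bands = subband_powers(power_spectrum, band_width)
    return max(p / (n + eps) for p, n in zip(bands, noise_subband_power))


def is_voice(power_spectrum, noise_subband_power, band_width, snr_threshold=3.0):
    """Voice/non-voice decision from the microphone SNR (threshold assumed)."""
    snr = microphone_snr(power_spectrum, noise_subband_power, band_width)
    return snr > snr_threshold
```

    Taking the maximum rather than the mean means that speech energy concentrated in one sub-band is enough to trigger detection, even when the other sub-bands are dominated by noise.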

    Voice recording equipment and method
    5.
    Granted invention patent
    Status: In force

    Publication No.: US08504358B2

    Publication date: 2013-08-06

    Application No.: US12913780

    Filing date: 2010-10-28

    IPC classification: G10L11/06 G10L21/02 H04R29/00

    Abstract: In voice recording equipment and a voice recording method, voice data from a speaker is received using a microphone. Threshold values T1 and T2 for the surrounding environment of the voice recording equipment are determined. If an intensity of the voice data is less than or equal to the threshold value T2, the voice recording is stopped and the speaker is informed that the voice data is not useful. If the intensity of the voice data is greater than the threshold values, the voice data is stored into a storage unit.
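
    The two threshold rules stated in this abstract can be written down as a small decision function. The abstract does not say what happens when the intensity lies above T2 but not above T1, so the sketch returns an explicit "unspecified" marker for that case; the function name and the string labels are illustrative assumptions.

```python
def handle_voice_data(intensity, t1, t2):
    """Apply the two rules stated in the abstract to one intensity value."""
    if intensity <= t2:
        # Recording stops and the speaker is told the data is not useful.
        return "stop_and_notify"
    if intensity > t1 and intensity > t2:
        # Intensity exceeds both thresholds: store the voice data.
        return "store"
    # Above T2 but not above T1: the abstract does not cover this case.
    return "unspecified"
```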

    Method and system for speech bandwidth extension
    7.
    Granted invention patent
    Status: In force

    Publication No.: US08447617B2

    Publication date: 2013-05-21

    Application No.: US12661344

    Filing date: 2010-03-15

    CPC classification: G10L21/038

    Abstract: There is provided a method or a device for extending a bandwidth of a first band speech signal to generate a second band speech signal wider than the first band speech signal and including the first band speech signal. The method comprises receiving a segment of the first band speech signal having a low cut off frequency and a high cut off frequency; determining the high cut off frequency of the segment; determining whether the segment is voiced or unvoiced; if the segment is voiced, applying a first bandwidth extension function to the segment to generate a first bandwidth extension in high frequencies; if the segment is unvoiced, applying a second bandwidth extension function to the segment to generate a second bandwidth extension in the high frequencies; using the first bandwidth extension and the second bandwidth extension to extend the first band speech signal beyond the high cut off frequency.

    Tandem-free intersystem voice communication
    8.
    Granted invention patent
    Status: In force

    Publication No.: US08432935B2

    Publication date: 2013-04-30

    Application No.: US12181972

    Filing date: 2008-07-29

    IPC classification: H04J3/16 G10L11/06

    CPC classification: G10L19/173 H04W88/181

    Abstract: Techniques are presented herein to provide tandem-free operation between two wireless terminals through two otherwise incompatible wireless networks. Specifically, embodiments provide tandem-free operation between a wireless terminal communicating through a continuous transmission (CTX) wireless channel to a wireless terminal communicating through a discontinuous transmission (DTX) wireless channel. In a first aspect, inactive speech frames are translated between DTX and CTX formats. In a second aspect, each wireless terminal includes an active speech decoder that is compatible with the active speech encoder on the opposite end of the mobile-to-mobile connection.

    Apparatus, method, and computer program product for judging speech/non-speech
    9.
    Granted invention patent
    Status: In force

    Publication No.: US08380500B2

    Publication date: 2013-02-19

    Application No.: US12234976

    Filing date: 2008-09-22

    IPC classification: G10L15/20 G10L11/06

    CPC classification: G10L25/78

    Abstract: A spectrum calculating unit calculates, for each of the frames, a spectrum by performing a frequency analysis on an acoustic signal. An estimating unit estimates a noise spectrum. An energy calculating unit calculates an energy characteristic amount. An entropy calculating unit calculates a normalized spectral entropy value. A generating unit generates a characteristic vector based on the energy characteristic amounts and the normalized spectral entropy values that have been calculated for a plurality of frames. A likelihood calculating unit calculates a speech likelihood value of a target frame that corresponds to the characteristic vector. In a case where the speech likelihood value is larger than a threshold value, a judging unit judges that the target frame is a speech frame.
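
    As a rough illustration of two of the features named in this abstract, the sketch below computes a normalized spectral entropy and stacks it with a log-energy value over several frames. The abstract does not define the energy characteristic amount, how the noise spectrum enters the computation, or the likelihood model, so the log-energy choice, the `eps` guard, and the function names are assumptions, and the noise estimate and likelihood step are left out.

```python
import math


def normalized_spectral_entropy(power_spectrum, eps=1e-12):
    """Entropy of the spectrum treated as a probability distribution,
    divided by log(N) so that a flat spectrum scores about 1.0.
    Requires at least two frequency bins."""
    total = sum(power_spectrum) + eps
    probs = [p / total for p in power_spectrum]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(power_spectrum))


def characteristic_vector(frame_spectra, eps=1e-12):
    """Concatenate (log energy, normalized entropy) over several frames.
    Using log energy as the energy feature is an assumption."""
    vector = []
    for spectrum in frame_spectra:
        vector.append(math.log(sum(spectrum) + eps))
        vector.append(normalized_spectral_entropy(spectrum, eps))
    return vector
```

    A flat, noise-like spectrum yields an entropy near 1, while a spectrum concentrated in a single bin yields an entropy near 0, which is why the value is useful for separating speech from broadband noise.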

    Apparatus and method of code conversion and recording medium that records program for computer to execute the method
    10.
    Granted invention patent
    Status: Lapsed

    Publication No.: US08374852B2

    Publication date: 2013-02-12

    Application No.: US11376436

    Filing date: 2006-03-16

    Applicant: Atsushi Murashima

    Inventor: Atsushi Murashima

    CPC classification: G10L19/173 G10L25/78

    Abstract: Disclosed is a code conversion method to convert a first code sequence conforming to a first speech coding scheme into a second code sequence conforming to a second speech coding scheme. The method includes the following steps. The first step discriminates whether the first code sequence corresponds to a speech part or to a non-speech part, and generates a numerical value that indicates the discrimination result as a control flag. The second step converts the first code sequence into the second code sequence and outputs said second code sequence, when the value of the control flag corresponds to the speech part. The third step outputs the second code sequence that corresponds to the value of the control flag, when the value of the control flag corresponds to the non-speech part.
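
    The three steps above can be sketched as a small dispatch function. Everything concrete here is hypothetical: the discriminator `is_speech`, the converter `transcode`, and the `preset_codes` table of second-scheme codes keyed by flag value are placeholders supplied by the caller, since the abstract names no specific codecs or flag values.

```python
SPEECH = 1
NON_SPEECH = 0


def convert_frame(first_code, is_speech, transcode, preset_codes):
    """Convert one first-scheme code sequence into a second-scheme one.

    is_speech, transcode and preset_codes are hypothetical placeholders.
    """
    # Step 1: discriminate speech from non-speech and record it as a flag.
    flag = SPEECH if is_speech(first_code) else NON_SPEECH
    if flag == SPEECH:
        # Step 2: speech part, so run the full code conversion.
        return flag, transcode(first_code)
    # Step 3: non-speech part, so output the code tied to the flag value.
    return flag, preset_codes[flag]
```

    In this sketch the non-speech path skips the conversion step entirely and returns the preset code selected by the flag.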
