VOICE ACTIVITY DETECTION FEATURE BASED ON MODULATION-PHASE DIFFERENCES
    1.
    Invention application (status: under examination, published)

    Publication number: WO2017196422A1

    Publication date: 2017-11-16

    Application number: PCT/US2017/018362

    Filing date: 2017-02-17

    CPC classification number: G10L25/93 G10L25/18 G10L25/84 G10L2025/932

    Abstract: Speech processing methods may rely on voice activity detection (VAD) that separates speech from noise. Example embodiments of a low-complexity VAD feature that is robust against various types of noise are introduced. By considering an alternating excitation structure of low and high frequencies, speech is detected with high confidence. The low-complexity VAD feature can cope even with the limited spectral resolution that may be typical for a communication system, such as an in-car communication (ICC) system. Simulation results confirm the robustness of the low-complexity VAD feature and show an increase in performance relative to established VAD features.

    METHODS AND APPARATUS FOR ROBUST SPEAKER ACTIVITY DETECTION
    2.
    Invention application (status: under examination, published)

    Publication number: WO2015047308A1

    Publication date: 2015-04-02

    Application number: PCT/US2013/062244

    Filing date: 2013-09-27

    Abstract: Method and apparatus to determine a speaker activity detection measure from energy-based characteristics of signals from a plurality of speaker-dedicated microphones, detect acoustic events using power spectra for the microphone signals, and determine a robust speaker activity detection measure from the speaker activity measure and the detected acoustic events.

    TECHNIQUES FOR WAKE-UP WORD RECOGNITION AND RELATED SYSTEMS AND METHODS
    3.
    Invention application (status: under examination, published)

    Publication number: WO2017217978A1

    Publication date: 2017-12-21

    Application number: PCT/US2016/037495

    Filing date: 2016-06-15

    Abstract: A system for detection of at least one designated wake-up word for at least one speech-enabled application. The system comprises at least one microphone; and at least one computer hardware processor configured to perform: receiving an acoustic signal generated by the at least one microphone at least in part as a result of receiving an utterance spoken by a speaker; obtaining information indicative of the speaker's identity; interpreting the acoustic signal at least in part by determining, using the information indicative of the speaker's identity and automated speech recognition, whether the utterance spoken by the speaker includes the at least one designated wake-up word; and interacting with the speaker based, at least in part, on results of the interpreting.

    ENHANCED DE-ESSER FOR IN-CAR COMMUNICATION SYSTEMS
    4.
    Invention application (status: under examination, published)

    Publication number: WO2017196382A1

    Publication date: 2017-11-16

    Application number: PCT/US2016/049914

    Filing date: 2016-09-01

    CPC classification number: G10L21/0364 G10L25/18 H03G9/025

    Abstract: Methods and systems for deessing of speech signals are described. A deesser of a speech processing system includes an analyzer configured to receive a full spectral envelope for each time frame of a speech signal presented to the speech processing system, and to analyze the full spectral envelope to identify frequency content for deessing. The deesser also includes a compressor configured to receive results from the analyzer and to spectrally weight the speech signal as a function of results of the analyzer. The analyzer can be configured to calculate a psychoacoustic measure from the full spectral envelope, and may be further configured to detect sibilant sounds of the speech signal using the psychoacoustic measure. The psychoacoustic measure can include, for example, a measure of sharpness, and the analyzer may be further configured to calculate deesser weights based on the measure of sharpness. An example application includes in-car communications.

    BABBLE NOISE SUPPRESSION
    5.
    Invention application (status: under examination, published)

    Publication number: WO2017136018A1

    Publication date: 2017-08-10

    Application number: PCT/US2016/062908

    Filing date: 2016-11-18

    Abstract: Systems and methods are introduced to perform noise suppression of an audio signal. The audio signal includes foreground speech components and background noise. The foreground speech components correspond to speech from a user's speaking into an audio receiving device. The background noise includes babble noise that includes speech from one or more interfering speakers. A soft speech detector determines, dynamically, a speech detection result indicating a likelihood of a presence of the foreground speech components in the audio signal. The speech detection result is employed to control, dynamically, an amount of attenuation of the noise suppression to reduce the babble noise in the audio signal. Further processing achieves a more stationary background and reduction of musical tones in the audio signal.
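
    Illustrative sketch (not taken from the patent): one possible reading of "the speech detection result is employed to control, dynamically, an amount of attenuation" is a per-frame gain that is interpolated between strong attenuation when foreground speech is unlikely and mild attenuation when it is likely. The function names, the linear mapping, and the 12 dB / 3 dB limits are assumptions chosen for illustration; a real suppressor would typically work per frequency bin rather than with one broadband gain.

```python
def attenuation_gain(speech_probability, max_atten_db=12.0, min_atten_db=3.0):
    """Map a soft speech-detection result in [0, 1] to a linear gain.

    Hypothetical mapping: strongest attenuation when speech is absent,
    mildest attenuation when foreground speech is likely.
    """
    p = min(1.0, max(0.0, speech_probability))  # clamp to [0, 1]
    atten_db = max_atten_db + p * (min_atten_db - max_atten_db)
    return 10.0 ** (-atten_db / 20.0)


def suppress(frames, speech_probabilities):
    """Scale each frame (a list of samples) by its probability-dependent gain."""
    return [
        [attenuation_gain(p) * x for x in frame]
        for frame, p in zip(frames, speech_probabilities)
    ]
```

    Frames with a low speech likelihood are attenuated more strongly, so babble between foreground utterances is reduced while likely foreground speech is left closer to its original level.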

    SYSTEM AND METHOD FOR SPEECH DETECTION ADAPTATION
    6.
    Invention application (status: under examination, published)

    Publication number: WO2017119901A1

    Publication date: 2017-07-13

    Application number: PCT/US2016/012692

    Filing date: 2016-01-08

    CPC classification number: G10L25/78

    Abstract: A method for speech detection adaptation is provided. Embodiments may include receiving, at a processor, a speech signal corresponding to a particular environment and estimating one or more acoustical parameters that characterize the environment. In some embodiments, the one or more acoustical parameters are not configured to identify a known scenario. Embodiments may include dynamically controlling a speech detector based upon, at least in part, the one or more acoustical parameters, wherein dynamically controlling includes configuring feature parameters and detector parameters.

    METHODS AND APPARATUS FOR SPEECH SEGMENTATION USING MULTIPLE METADATA
    7.
    Invention application (status: under examination, published)

    Publication number: WO2016028254A1

    Publication date: 2016-02-25

    Application number: PCT/US2014/051457

    Filing date: 2014-08-18

    Abstract: Methods and apparatus to process microphone signals by a speech enhancement module to generate an audio stream signal including first and second metadata for use by a speech recognition module. In an embodiment, speech recognition is performed using endpointing information including transitioning from a silence state to a maybe speech state, in which data is buffered, based on the first metadata and transitioning to a speech state, in which speech recognition is performed, based upon the second metadata.
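
    Illustrative sketch (not taken from the patent): the endpointing described above can be read as a three-state machine, silence, maybe speech (buffering), and speech (recognition). In the code below the two metadata streams are modeled as per-frame booleans, and the fallback transitions back to silence are assumptions, since the abstract only names the forward transitions. All names are hypothetical.

```python
from enum import Enum


class State(Enum):
    SILENCE = "silence"
    MAYBE_SPEECH = "maybe_speech"
    SPEECH = "speech"


def segment(frames):
    """Run three-state endpointing over (audio, meta1, meta2) tuples.

    meta1 and meta2 stand in for the first and second metadata as booleans
    (an assumption). Returns the audio frames handed to the recognizer.
    """
    state = State.SILENCE
    buffered = []
    recognized = []
    for audio, meta1, meta2 in frames:
        if state is State.SILENCE:
            if meta1:  # first metadata: start buffering
                state = State.MAYBE_SPEECH
                buffered = [audio]
        elif state is State.MAYBE_SPEECH:
            buffered.append(audio)
            if meta2:  # second metadata: confirm speech, flush the buffer
                state = State.SPEECH
                recognized.extend(buffered)
                buffered = []
            elif not meta1:  # assumed fallback: discard the buffer
                state = State.SILENCE
                buffered = []
        else:  # State.SPEECH
            if meta1 or meta2:
                recognized.append(audio)
            else:  # assumed fallback: end of utterance
                state = State.SILENCE
    return recognized
```

    Buffering in the maybe-speech state means the onset of an utterance is not lost when the second metadata confirms speech a few frames later.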

    WIND NOISE DETECTION FOR IN-CAR COMMUNICATION SYSTEMS WITH MULTIPLE ACOUSTIC ZONES
    8.
    Invention application (status: under examination, published)

    Publication number: WO2013187946A2

    Publication date: 2013-12-19

    Application number: PCT/US2013/027738

    Filing date: 2013-02-26

    CPC classification number: H04R3/002 G10L21/0208 H04R3/005 H04R2499/13

    Abstract: An in-car communication (ICC) system has multiple acoustic zones having varying acoustic environments. At least one input microphone within at least one acoustic zone develops a corresponding microphone signal from one or more system users. At least one loudspeaker within at least one acoustic zone provides acoustic audio to the system users. A wind noise module makes a determination of when wind noise is present in the microphone signal and modifies the microphone signal based on the determination.

    LOW COMPLEXITY DETECTION OF VOICED SPEECH AND PITCH ESTIMATION
    9.

    Publication number: WO2019035835A1

    Publication date: 2019-02-21

    Application number: PCT/US2017/047361

    Filing date: 2017-08-17

    Abstract: A low-complexity method and apparatus for detection of voiced speech and pitch estimation is disclosed that is capable of dealing with special constraints given by applications where low latency is required, such as in-car communication (ICC) systems. An example embodiment employs very short frames that may capture only a single excitation impulse of voiced speech in an audio signal. A distance between multiple such impulses, corresponding to a pitch period, may be determined by evaluating phase differences between low-resolution spectra of the very short frames. An example embodiment may perform pitch estimation directly in a frequency domain based on the phase differences and reduce computational complexity by obviating transformation to a time domain to perform the pitch estimation. In an event the phase differences are determined to be substantially linear, an example embodiment enhances voice quality of the voiced speech by applying speech enhancement to the audio signal.
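
    Illustrative sketch (not the patent's procedure): the idea that phase differences between spectra of short frames reveal the distance between excitation impulses follows from the Fourier shift property, under which a shift of d samples adds a phase that is linear in frequency. The code below assumes idealized frames that each contain a single unit impulse, uses a naive DFT, and reads the offset from one bin only; the function names and the frame_shift parameter are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive DFT, adequate for the very short frames discussed above."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def impulse_offset(frame_a, frame_b, bin_index=1):
    """Estimate how many samples later the impulse sits in frame_b than in frame_a.

    Uses the phase difference of a single spectral bin. Only valid while the
    true offset magnitude stays below len(frame) / (2 * bin_index), so that
    the phase does not wrap.
    """
    n = len(frame_a)
    spec_a = dft(frame_a)
    spec_b = dft(frame_b)
    phase_diff = cmath.phase(spec_b[bin_index] * spec_a[bin_index].conjugate())
    return -phase_diff * n / (2.0 * math.pi * bin_index)


def pitch_period(frame_a, frame_b, frame_shift):
    """Impulse distance = shift between the frames + offset inside the frames."""
    return frame_shift + impulse_offset(frame_a, frame_b)
```

    With real speech the phase difference would be checked across many bins, and a near-linear trend over frequency would indicate voiced excitation; this sketch shows only the single-bin relationship.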

    BABBLE NOISE SUPPRESSION
    10.
    Invention application

    Publication number: WO2017136018A9

    Publication date: 2017-08-10

    Application number: PCT/US2016/062908

    Filing date: 2016-11-18

    Abstract: Systems and methods are introduced to perform noise suppression of an audio signal. The audio signal includes foreground speech components and background noise. The foreground speech components correspond to speech from a user's speaking into an audio receiving device. The background noise includes babble noise that includes speech from one or more interfering speakers. A soft speech detector determines, dynamically, a speech detection result indicating a likelihood of a presence of the foreground speech components in the audio signal. The speech detection result is employed to control, dynamically, an amount of attenuation of the noise suppression to reduce the babble noise in the audio signal. Further processing achieves a more stationary background and reduction of musical tones in the audio signal.
