Method and system for adjusting user speech in a communication session
    82.
    Granted patent
    Status: in force

    Publication number: US09558756B2

    Publication date: 2017-01-31

    Application number: US14065903

    Filing date: 2013-10-29

    Abstract: A system that incorporates the subject disclosure may, for example, receive user speech captured at a second end user device during a communication session between the second end user device and a first end user device, apply speech recognition to the user speech, identify an unclear word in the user speech based on the speech recognition, adjust the user speech to generate adjusted user speech by replacing all or a portion of the unclear word with replacement audio content, and provide the adjusted user speech to the first end user device during the communication session. Other embodiments are disclosed.
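
    The replacement step in this abstract can be sketched roughly as follows. This is an illustration only, not the patented method: the abstract says unclear words are identified "based on the speech recognition" without saying how, so the confidence threshold, the `RecognizedWord` type, and the `replacements` lookup table are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RecognizedWord:
    text: str            # word hypothesis from a (hypothetical) recognizer
    confidence: float    # recognizer confidence in [0, 1] (assumed signal)
    audio: List[float]   # samples captured for this word


def adjust_speech(words: List[RecognizedWord],
                  replacements: Dict[str, List[float]],
                  threshold: float = 0.5) -> List[float]:
    """Build the adjusted sample stream, swapping in replacement audio
    for words whose confidence falls below the threshold."""
    adjusted: List[float] = []
    for word in words:
        unclear = word.confidence < threshold
        if unclear and word.text in replacements:
            # Replace the whole unclear word with clean audio.
            adjusted.extend(replacements[word.text])
        else:
            # Clear words (or words with no replacement) pass through.
            adjusted.extend(word.audio)
    return adjusted


words = [
    RecognizedWord("hello", 0.95, [0.1, 0.2]),
    RecognizedWord("world", 0.30, [0.9, -0.9, 0.9]),
]
replacements = {"world": [0.3, 0.4]}
adjusted = adjust_speech(words, replacements)
```

    Here the second word is below the assumed threshold, so its samples are swapped for the stored replacement before the stream would be sent to the other device.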

    Audio encoder and decoder with pitch prediction
    83.
    Granted patent
    Status: in force

    Publication number: US09558754B2

    Publication date: 2017-01-31

    Application number: US15097201

    Filing date: 2016-04-12

    Abstract: In one embodiment, an audio decoder for decoding an encoded audio bitstream is disclosed. The audio decoder is capable of being operated in at least three different decoding modes. The audio decoder includes a demultiplexer for obtaining audio data and control information from the encoded audio bitstream. The audio decoder also includes a first audio decoder configured to operate in a first decoding mode using a first decoding technique and a second audio decoder configured to operate in a second decoding mode using a second decoding technique. The audio decoder also includes a pitch predictor integrated into the second audio decoder. The pitch predictor includes a long-term prediction filter and a short-term prediction filter. The audio decoder further includes a selector for selecting one of the at least three different decoding modes based on at least some of the control information.

    ELECTRONIC DEVICES AND METHODS FOR COMPENSATING FOR ENVIRONMENTAL NOISE IN TEXT-TO-SPEECH APPLICATIONS
    84.
    Patent application
    Status: in force

    Publication number: US20160275936A1

    Publication date: 2016-09-22

    Application number: US14374170

    Filing date: 2013-12-17

    Inventor: Ola Thorn

    Abstract: A method by an electronic device for compensating for environmental noise in text-to-speech (TTS) speech output includes: measuring environmental noise using a microphone signal; determining sound characteristics of the measured environmental noise; dynamically predicting expected future sound characteristics of the environmental noise based on the determined sound characteristics of the measured environmental noise; receiving a text input at a TTS engine at the device, with the TTS engine configured to convert the text input into a speech output signal; determining text characteristics of the text input at the TTS engine; and at the TTS engine, dynamically adapting the speech output signal based on the determined text characteristics of the text input and the predicted expected future sound characteristics of the environmental noise.
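
    As a rough sketch of the predict-then-adapt loop described above: the abstract does not specify the prediction model, the text characteristics, or how the output is adapted, so the linear extrapolation, the "contains digits" heuristic, and every constant below are assumptions chosen only to make the example concrete.

```python
from typing import List


def predict_next_level(levels_db: List[float]) -> float:
    """Predict the next noise level by linear extrapolation from the
    last two measurements (an assumed, deliberately simple model).
    Requires at least one measurement."""
    if len(levels_db) < 2:
        return levels_db[-1]
    return levels_db[-1] + (levels_db[-1] - levels_db[-2])


def text_is_critical(text: str) -> bool:
    """Hypothetical text characteristic: digits are hard to guess
    from context, so they get extra emphasis."""
    return any(ch.isdigit() for ch in text)


def output_gain_db(text: str, levels_db: List[float],
                   quiet_db: float = 40.0,
                   max_gain_db: float = 12.0) -> float:
    """Choose a TTS output gain from the predicted noise level and
    the text characteristic, capped at max_gain_db."""
    predicted = predict_next_level(levels_db)
    gain = max(0.0, predicted - quiet_db) * 0.5   # half of the excess noise
    if text_is_critical(text):
        gain += 3.0                               # extra boost for digits
    return min(gain, max_gain_db)


gain = output_gain_db("Gate 42", [50.0, 54.0])
```

    With noise rising from 50 dB to 54 dB, the sketch extrapolates to 58 dB and raises the gain before the louder noise arrives, which is the point of predicting rather than only measuring.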

    Audio encoder and decoder with multiple coding modes
    86.
    Granted patent
    Status: in force

    Publication number: US09396736B2

    Publication date: 2016-07-19

    Application number: US14936393

    Filing date: 2015-11-09

    Abstract: In one embodiment, an audio decoder for decoding an audio bitstream is disclosed. The decoder includes a first decoding module adapted to operate in a first coding mode and a second decoding module adapted to operate in a second coding mode, the second coding mode being different from the first coding mode. The decoder further includes a pitch filter in either the first coding mode or the second coding mode, the pitch filter adapted to filter a preliminary audio signal generated by the first decoding module or the second decoding module to obtain a filtered signal. The pitch filter is selectively enabled or disabled based on a value of a first parameter encoded in the audio bitstream, the first parameter being distinct from a second parameter encoded in the audio bitstream, the second parameter specifying a current coding mode of the audio decoder.
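
    The key point of this abstract, that the pitch filter is switched by a parameter separate from the one that selects the coding mode, can be illustrated with a toy decoder. The `FrameHeader` layout, the two stand-in decoding functions, and the comb-filter coefficients are hypothetical and do not come from the patent; for simplicity the sketch also allows the filter after either mode.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameHeader:
    coding_mode: int             # "second parameter": current coding mode
    pitch_filter_enabled: bool   # "first parameter": independent on/off flag


def decode_mode_a(payload: List[float]) -> List[float]:
    """Stand-in for the first decoding module."""
    return [x * 1.0 for x in payload]


def decode_mode_b(payload: List[float]) -> List[float]:
    """Stand-in for the second decoding module."""
    return [x * 0.5 for x in payload]


def pitch_filter(signal: List[float], lag: int = 2,
                 gain: float = 0.5) -> List[float]:
    """Toy long-term (comb) filter: y[n] = x[n] + gain * y[n - lag]."""
    out: List[float] = []
    for n, x in enumerate(signal):
        prev = out[n - lag] if n >= lag else 0.0
        out.append(x + gain * prev)
    return out


def decode_frame(header: FrameHeader, payload: List[float]) -> List[float]:
    # The mode parameter picks the decoding module...
    decoder = decode_mode_a if header.coding_mode == 0 else decode_mode_b
    preliminary = decoder(payload)
    # ...while a separate flag decides whether the pitch filter runs.
    if header.pitch_filter_enabled:
        return pitch_filter(preliminary)
    return preliminary


unfiltered = decode_frame(FrameHeader(0, False), [1.0, 0.0, 0.0, 0.0])
filtered = decode_frame(FrameHeader(0, True), [1.0, 0.0, 0.0, 0.0])
```

    Keeping the two parameters separate means an encoder could turn the filter off for a frame without changing the coding mode.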

    Voice quality conversion system, voice quality conversion device, voice quality conversion method, vocal tract information generation device, and vocal tract information generation method
    88.
    Granted patent
    Status: in force

    Publication number: US09240194B2

    Publication date: 2016-01-19

    Application number: US13872183

    Filing date: 2013-04-29

    CPC classification number: G10L21/003 G10L13/033 G10L21/04 G10L25/15

    Abstract: A voice quality conversion system includes: an analysis unit which analyzes sounds of plural vowels of different types to generate first vocal tract shape information for each type of the vowels; a combination unit which combines, for each type of the vowels, the first vocal tract shape information on that type of vowel and the first vocal tract shape information on a different type of vowel to generate second vocal tract shape information on that type of vowel; and a synthesis unit which (i) combines vocal tract shape information on a vowel included in input speech and the second vocal tract shape information on the same type of vowel to convert vocal tract shape information on the input speech, and (ii) generates a synthetic sound using the converted vocal tract shape information and voicing source information on the input speech to convert the voice quality of the input speech.

    Method and system for non-parametric voice conversion
    89.
    Granted patent
    Status: in force

    Publication number: US09183830B2

    Publication date: 2015-11-10

    Application number: US14069510

    Filing date: 2013-11-01

    Applicant: Google Inc.

    Abstract: A method and system are disclosed for non-parametric speech conversion. A text-to-speech (TTS) synthesis system may include hidden Markov model (HMM) based speech modeling for synthesizing output speech. A converted HMM may be initially set to a source HMM trained with a voice of a source speaker. A parametric representation of speech may be extracted from speech of a target speaker to generate a set of target-speaker vectors. A matching procedure, carried out under a transform that compensates for speaker differences, may be used to match each HMM state of the source HMM to a target-speaker vector. The HMM states of the converted HMM may be replaced with the matched target-speaker vectors. Transforms may be applied to further adapt the converted HMM to the voice of the target speaker. The converted HMM may be used to synthesize speech with voice characteristics of the target speaker.
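
    The match-and-replace step can be sketched as a nearest-neighbour search. The abstract does not say which transform compensates for speaker differences or which distance is used, so the mean shift and squared Euclidean distance below are assumptions, and each HMM state is reduced to a single mean vector purely for illustration.

```python
from typing import List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def convert_states(source_states: List[Vector],
                   target_vectors: List[Vector]) -> List[Vector]:
    """Replace each source state with its closest target-speaker vector,
    matching under an assumed mean-shift transform."""
    src_mean = mean_vector(source_states)
    tgt_mean = mean_vector(target_vectors)
    # Assumed speaker-compensating transform: align the global means.
    shift = [t - s for s, t in zip(src_mean, tgt_mean)]
    converted: List[Vector] = []
    for state in source_states:
        shifted = [x + d for x, d in zip(state, shift)]
        best = min(target_vectors,
                   key=lambda v: squared_distance(shifted, v))
        converted.append(best)
    return converted


source_states = [[0.0, 0.0], [2.0, 2.0]]
target_vectors = [[10.0, 10.0], [12.0, 12.5], [11.0, 11.0]]
converted = convert_states(source_states, target_vectors)
```

    The result contains only vectors taken from the target speaker's data, which is what makes the approach non-parametric: states are replaced by observed vectors rather than by re-estimated model parameters.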

    Adaptive voice intelligibility processor
    90.
    Granted patent
    Status: in force

    Publication number: US09117455B2

    Publication date: 2015-08-25

    Application number: US13559450

    Filing date: 2012-07-26

    Abstract: Systems and methods for adaptively processing speech to improve voice intelligibility are described. These systems and methods can adaptively identify and track formant locations, thereby enabling formants to be emphasized as they change. As a result, these systems and methods can improve near-end intelligibility, even in noisy environments. The systems and methods can be implemented in Voice-over IP (VoIP) applications, telephone and/or video conference applications (including on cellular phones, smart phones, and the like), laptop and tablet communications, and the like. The systems and methods can also enhance non-voiced speech, which can include speech generated without the vocal tract, such as transient speech.
