Patent search ipc:"G10L21/003" Page 9

81.

发明申请
METHOD AND SYSTEM FOR COMMUNICATING WITH A USER IMMERSED IN A VIRTUAL REALITY ENVIRONMENT 审中-公开
Title translation: 与虚拟现实环境中的用户通信的方法和系统

公开(公告)号：US20170068508A1

公开(公告)日：2017-03-09

申请号：US15249664

申请日：2016-08-29

Applicant: Nokia Technologies Oy

Inventor： Francesco Cricri , Jussi Artturi Leppänen , Antti Johannes Eronen , Arto Juhani Lehtiniemi

IPC: G06F3/16 , G06F3/0481 , G10L21/003 , G06F3/01

CPC classification number: G06F3/167 , G06F3/017 , H04S7/30 , H04S7/304 , H04S2400/11

Abstract: A method comprising: receiving a request to create a virtual communication channel between the real world and a virtual reality environment, the virtual reality environment comprising both audio and visual content; in response to receiving the request, causing a virtual window to be displayed in the virtual reality environment; and causing distorted audio from real world surroundings of a user making the request to emanate from the virtual window.

Abstract translation: 一种方法，包括：接收在现实世界和虚拟现实环境之间创建虚拟通信信道的请求，所述虚拟现实环境包括音频和视觉内容; 响应于接收到请求，使虚拟窗口在虚拟现实环境中显示; 并使来自虚拟窗口的请求的用户的真实环境中的音频变形。

82.

发明授权
Method and system for adjusting user speech in a communication session 有权
Title translation: 在通信会话中调整用户语音的方法和系统

公开(公告)号：US09558756B2

公开(公告)日：2017-01-31

申请号：US14065903

申请日：2013-10-29

Applicant: AT&T Intellectual Property I, LP

Inventor： Randolph Wohlert , Aaron Bangor , Mark Stockert

IPC: G10L21/02 , G10L21/003 , G10L15/00 , G10L13/033

CPC classification number: G09B21/00 , G09B21/006 , G09B21/009 , G10L13/02 , G10L13/033 , G10L15/00 , G10L15/02 , G10L21/003 , G10L21/02 , G10L21/0364 , G10L21/10 , H04L67/22 , H04L67/306

Abstract: A system that incorporates the subject disclosure may include, for example, receive user speech captured at a second end user device during a communication session between the second end user device and a first end user device, apply speech recognition to the user speech, identify an unclear word in the user speech based on the speech recognition, adjust the user speech to generate adjusted user speech by replacing all or a portion of the unclear word with replacement audio content, and provide the adjusted user speech to the first end user device during the communication session. Other embodiments are disclosed.

Abstract translation: 结合本发明的系统可以包括例如在第二最终用户设备和第一终端用户设备之间的通信会话期间接收在第二最终用户设备处捕获的用户语音，对用户语音应用语音识别，基于语音识别的用户语音中的不清楚的单词，通过用替换的音频内容替换全部或一部分不清楚的单词来调整用户语音以产生调整后的用户语音，并且在该时间段期间向第一终端用户设备提供经调整的用户语音沟通会话公开了其他实施例。

83.

发明授权
Audio encoder and decoder with pitch prediction 有权
Title translation: 具有音调预测的音频编码器和解码器

公开(公告)号：US09558754B2

公开(公告)日：2017-01-31

申请号：US15097201

申请日：2016-04-12

Applicant: DOLBY INTERNATIONAL AB

Inventor： Barbara Resch , Kristofer Kjörling , Lars Villemoes

IPC: G10L21/00 , G10L19/26 , G10L19/107 , G10L19/20 , G10L19/12 , G10L19/125 , G10L21/003 , G10L19/09 , G10L21/013 , G10L19/22 , G10L21/007 , G10L19/032 , G10L19/02

CPC classification number: G10L19/26 , G10L19/02 , G10L19/0212 , G10L19/032 , G10L19/09 , G10L19/107 , G10L19/12 , G10L19/125 , G10L19/20 , G10L19/22 , G10L19/265 , G10L21/003 , G10L21/007 , G10L21/013

Abstract: In one embodiment, an audio decoder for decoding an encoded audio bitstream is disclosed. The audio decoder is capable of being operated in at least three different decoding modes. The audio decoder includes a demultiplexer for obtaining audio data and control information from the encoded audio bitstream. The audio decoder also includes a first audio decoder configured to operate in a first decoding mode using a first decoding technique and a second audio decoder configured to operate in a second decoding mode using a second decoding technique. The audio decoder also includes a pitch predictor integrated into the second audio decoder. The pitch predictor includes a long-term prediction filter and a short-term prediction filter. The audio decoder further includes a selector for selecting one of the at least three different decoding modes based on at least some of the control information.

Abstract translation: 在一个实施例中，公开了一种用于对编码音频比特流进行解码的音频解码器。音频解码器能够以至少三种不同的解码模式进行操作。音频解码器包括用于从编码的音频比特流获得音频数据和控制信息的解复用器。音频解码器还包括被配置为使用第一解码技术以第一解码模式操作的第一音频解码器和被配置为使用第二解码技术在第二解码模式下操作的第二音频解码器。音频解码器还包括集成到第二音频解码器中的音调预测器。音调预测器包括长期预测滤波器和短期预测滤波器。音频解码器还包括选择器，用于基于至少一些控制信息来选择至少三种不同解码模式之一。

84.

发明申请
ELECTRONIC DEVICES AND METHODS FOR COMPENSATING FOR ENVIRONMENTAL NOISE IN TEXT-TO-SPEECH APPLICATIONS 有权
Title translation: 用于补充文字到语音应用程序中的环境噪声的电子设备和方法

公开(公告)号：US20160275936A1

公开(公告)日：2016-09-22

申请号：US14374170

申请日：2013-12-17

Applicant: Sony Corporation

Inventor： Ola Thorn

IPC: G10L13/033 , G10L15/06 , G10L21/0216 , G10L13/047 , G10L13/08 , G10L21/003

CPC classification number: G10L13/0335 , G10L13/047 , G10L13/08 , G10L15/063 , G10L21/003 , G10L21/0216 , G10L2015/0631

Abstract: A method by an electronic device for compensating for environmental noise in text-to-speech (TTS) speech output includes: measuring environmental noise using a microphone signal; determining sound characteristics of the measured environmental noise; dynamically predicting expected future sound characteristics of the environmental noise based on the determined sound characteristics of the measured environmental noise; receiving a text input at a TTS engine at the device, with the TTS engine configured to convert the text input into a speech output signal; determining text characteristics of the text input at the TTS engine; and at the TTS engine, dynamically adapting the speech output signal based on the determined text characteristics of the text input and the predicted expected future sound characteristics of the environmental noise.

Abstract translation: 一种用于补偿文本到语音（TTS）语音输出中的环境噪声的电子设备的方法包括：使用麦克风信号测量环境噪声; 确定所测量的环境噪声的声音特性; 基于所测定的环境噪声的声音特性动态预测环境噪声的预期未来声音特征; 在所述设备处的TTS引擎处接收文本输入，所述TTS引擎被配置为将所述文本输入转换为语音输出信号; 确定TTS引擎文本输入的文本特征; 并且在TTS引擎中，基于所确定的文本输入的文本特征和环境噪声的预测的未来预期声音特性来动态地调整语音输出信号。

85.

发明申请
REAL-TIME REMODELING OF USER VOICE IN AN IMMERSIVE VISUALIZATION SYSTEM 有权
Title translation: 用户声音在实时可视化系统中实时重构

公开(公告)号：US20160260441A1

公开(公告)日：2016-09-08

申请号：US14641174

申请日：2015-03-06

Applicant: Andrew Frederick Muehlhausen , Matthew Johnston , Kasson Crooker

Inventor： Andrew Frederick Muehlhausen , Matthew Johnston , Kasson Crooker

IPC: G10L21/034 , G10L21/003 , G06F3/01 , G06T19/00 , G02B27/01

CPC classification number: G10L21/0202 , A63F13/215 , A63F13/54 , G02B27/017 , G02B27/0172 , G02B2027/0138 , G02B2027/014 , G02B2027/0178 , G06F3/011 , G06F3/012 , G06T17/05 , G06T19/006 , G10L21/003 , G10L21/02 , G10L21/034 , H04S7/304 , H04S2400/15 , H04S2420/01

Abstract: A visualization system with audio capability includes one or more display devices, one or more microphones, one or more speakers, and audio processing circuitry. While a display device displays a holographic image to a user, a microphone inputs an utterance of the user, or a sound from the user's environment, and provides it to the audio processing circuitry. The audio processing circuitry processes the utterance (or other sound) in real-time to add an audio effect associated with the image to increase realism, and outputs the processed utterance (or other sound) to the user via the speaker in real-time, with very low latency.

Abstract translation: 具有音频功能的可视化系统包括一个或多个显示设备，一个或多个麦克风，一个或多个扬声器和音频处理电路。当显示设备向用户显示全息图像时，麦克风输入用户的话语或来自用户环境的声音，并将其提供给音频处理电路。音频处理电路实时地处理话音（或其他声音）以添加与图像相关联的音频效果以增加现实感，并且经由扬声器实时地将处理的话语（或其他声音）输出给用户，具有非常低的延迟。

86.

发明授权
Audio encoder and decoder with multiple coding modes 有权
Title translation: 具有多种编码模式的音频编码器和解码器

公开(公告)号：US09396736B2

公开(公告)日：2016-07-19

申请号：US14936393

申请日：2015-11-09

Applicant: DOLBY INTERNATIONAL AB

Inventor： Barbara Resch , Kristofer Kjörling , Lars Villemoes

IPC: G10L19/00 , G10L19/26 , G10L19/20 , G10L19/12 , G10L19/125 , G10L21/003 , G10L19/02 , G10L19/107

CPC classification number: G10L19/26 , G10L19/02 , G10L19/0212 , G10L19/032 , G10L19/09 , G10L19/107 , G10L19/12 , G10L19/125 , G10L19/20 , G10L19/22 , G10L19/265 , G10L21/003 , G10L21/007 , G10L21/013

Abstract: In one embodiment, an audio decoder for decoding an audio bitstream is disclosed. The decoder includes a first decoding module adapted to operate in a first coding mode and a second decoding module adapted to operate in a second coding mode, the second coding mode being different from the first coding mode. The decoder further includes a pitch filter in either the first coding mode or the second coding mode, the pitch filter adapted to filter a preliminary audio signal generated by the first decoding module or the second decoding module to obtain a filtered signal. The pitch filter is selectively enabled or disabled based on a value of a first parameter encoded in the audio bitstream, the first parameter being distinct from a second parameter encoded in the audio bitstream, the second parameter specifying a current coding mode of the audio decoder.

Abstract translation: 在一个实施例中，公开了一种用于对音频比特流进行解码的音频解码器。解码器包括适于以第一编码模式操作的第一解码模块和适于以第二编码模式操作的第二解码模块，第二编码模式不同于第一编码模式。解码器还包括在第一编码模式或第二编码模式中的音调滤波器，音调滤波器适于滤除由第一解码模块或第二解码模块产生的初步音频信号以获得滤波信号。基于在音频比特流中编码的第一参数的值，音调滤波器被选择性地启用或禁用，第一参数不同于在音频比特流中编码的第二参数，第二参数指定音频解码器的当前编码模式。

87.

发明授权
Improving voice communication over a network 有权

公开(公告)号：US09286889B2

公开(公告)日：2016-03-15

申请号：US13752503

申请日：2013-01-29

Applicant: International Business Machines Corporation

Inventor： Dimitri Kanevsky , Pamela A. Nesbitt , Tara N. Sainath , Elizabeth V. Woodward

IPC: G10L21/00 , G10L21/02 , G10L15/08 , G10L15/18 , G10L15/26 , G10L21/003 , G10L21/057 , G10L25/60 , H04M3/56

CPC classification number: G10L15/08 , G10L15/18 , G10L15/26 , G10L21/003 , G10L21/02 , G10L21/057 , G10L25/60 , H04M3/56

Abstract: Systems and methods for improving communication over a network are provided. A system for improving communication over a network, comprises a detection module capable of detecting data indicating a problem with a communication between at least two participants communicating via communication devices over the network, a management module capable of analyzing the data to determine whether a participant is dissatisfied with the communication, wherein the management module includes a determining module capable of determining that the participant is dissatisfied, and identifying an event causing the dissatisfaction, and a resolution module capable of providing a solution for eliminating the problem.

88.

发明授权
Voice quality conversion system, voice quality conversion device, voice quality conversion method, vocal tract information generation device, and vocal tract information generation method 有权
Title translation: 语音质量转换系统，语音质量转换装置，语音质量转换方法，声道信息生成装置和声道信息生成方法

公开(公告)号：US09240194B2

公开(公告)日：2016-01-19

申请号：US13872183

申请日：2013-04-29

Applicant: Panasonic Corporation

Inventor： Takahiro Kamai , Yoshifumi Hirose

IPC: G10L13/00 , G10L21/003 , G10L25/15 , G10L21/04 , G10L13/033

CPC classification number: G10L21/003 , G10L13/033 , G10L21/04 , G10L25/15

Abstract: A voice quality conversion system includes: an analysis unit which analyzes sounds of plural vowels of different types to generate first vocal tract shape information for each type of the vowels; a combination unit which combines, for each type of the vowels, the first vocal tract shape information on that type of vowel and the first vocal tract shape information on a different type of vowel to generate second vocal tract shape information on that type of vowel; and a synthesis unit which (i) combines vocal tract shape information on a vowel included in input speech and the second vocal tract shape information on the same type of vowel to convert vocal tract shape information on the input speech, and (ii) generates a synthetic sound using the converted vocal tract shape information and voicing source information on the input speech to convert the voice quality of the input speech.

Abstract translation: 语音质量转换系统包括：分析单元，其分析不同类型的多个元音的声音，以生成每种类型的元音的第一声道形状信息; 组合单元，其对于每种类型的元音组合关于该类型的元音的第一声道形状信息和关于不同类型的元音的第一声道形状信息，以产生关于该类型的元音的第二声道形状信息; 以及合成单元，其（i）将包括在输入语音中的元音的声道形状信息与相同类型的元音的第二声道形状信息相结合，以在输入语音上转换声道形状信息，以及（ii）生成使用转换的声道形状信息的合成声音和对输入语音的发声源信息来转换输入语音的语音质量。

89.

发明授权
Method and system for non-parametric voice conversion 有权
Title translation: 非参数语音转换的方法和系统

公开(公告)号：US09183830B2

公开(公告)日：2015-11-10

申请号：US14069510

申请日：2013-11-01

Applicant: Google Inc.

Inventor： Ioannis Agiomyrgiannakis

IPC: G10L15/00 , G10L15/04 , G10L15/14 , G10L13/02 , G10L21/003 , G10L15/07 , G10L13/033 , G10L15/26

CPC classification number: G10L13/02 , G10L13/0335 , G10L15/07 , G10L15/144 , G10L15/26 , G10L21/003

Abstract: A method and system is disclosed for non-parametric speech conversion. A text-to-speech (TTS) synthesis system may include hidden Markov model (HMM) HMM based speech modeling for both synthesizing output speech. A converted HMM may be initially set to a source HMM trained with a voice of a source speaker. A parametric representation of speech may be extract from speech of a target speaker to generate a set of target-speaker vectors. A matching procedure, carried out under a transform that compensates for speaker differences, may be used to match each HMM state of the source HMM to a target-speaker vector. The HMM states of the converted HMM may be replaced with the matched target-speaker vectors. Transforms may be applied to further adapt the converted HMM to the voice of target speaker. The converted HMM may be used to synthesize speech with voice characteristics of the target speaker.

Abstract translation: 公开了用于非参数语音转换的方法和系统。文本到语音（TTS）合成系统可以包括用于合成输出语音的隐马尔可夫模型（HMM）基于HMM的语音建模。可以将经转换的HMM初始设置为用源扬声器的声音训练的源HMM。可以从目标说话者的语音中提取语音的参数表示，以产生一组目标扬声器向量。可以使用在补偿扬声器差异的变换下执行的匹配过程来将源HMM的每个HMM状态与目标扬声器向量相匹配。转换的HMM的HMM状态可以用匹配的目标扬声器向量替换。可以应用变换来进一步使转换的HMM适应目标扬声器的声音。转换的HMM可以用于合成具有目标扬声器的语音特征的语音。

90.

发明授权
Adaptive voice intelligibility processor 有权
Title translation: 自适应语音清晰度处理器

公开(公告)号：US09117455B2

公开(公告)日：2015-08-25

申请号：US13559450

申请日：2012-07-26

Applicant: James Tracey , Daekyong Noh , Xing He

Inventor： James Tracey , Daekyong Noh , Xing He

IPC: G10L25/90 , G10L25/93 , G10L25/00 , G10L21/00 , G10L19/12 , G10L19/02 , G10L21/02 , G10L15/00 , G10L15/20 , H03G3/20 , H04R25/00 , H04B15/00 , G10L21/003 , G10L21/0316 , G10L21/0364 , G10L19/07 , G10L25/15

CPC classification number: G10L21/003 , G10L19/07 , G10L21/0316 , G10L21/0364 , G10L25/15

Abstract: Systems and methods for adaptively processing speech to improve voice intelligibility are described. These systems and methods can adaptively identify and track formant locations, thereby enabling formants to be emphasized as they change. As a result, these systems and methods can improve near-end intelligibility, even in noisy environments. The systems and methods can be implemented in Voice-over IP (VoIP) applications, telephone and/or video conference applications (including on cellular phones, smart phones, and the like), laptop and tablet communications, and the like. The systems and methods can also enhance non-voiced speech, which can include speech generated without the vocal track, such as transient speech.

Abstract translation: 描述了用于自适应地处理语音以提高语音可懂度的系统和方法。这些系统和方法可以自适应地识别和跟踪共振峰位置，从而使共振体在变化时被强调。因此，即使在嘈杂的环境中，这些系统和方法也可以改善近端的清晰度。系统和方法可以在IP语音（VoIP）应用，电话和/或视频会议应用（包括蜂窝电话，智能电话等），膝上型计算机和平板电脑等实现。系统和方法还可以增强非语音语音，其可以包括没有声道的语音，例如瞬态语音。

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification