-
Publication No.: US20180336920A1
Publication Date: 2018-11-22
Application No.: US15842994
Application Date: 2017-12-15
IPC Classification: G10L25/84 , G10L21/0232 , G10L15/26 , G10L15/22 , G10L21/0208
CPC Classification: G10L25/84 , G10L15/02 , G10L15/20 , G10L15/22 , G10L15/26 , G10L17/02 , G10L21/0208 , G10L21/0232 , G10L2015/223 , G10L2021/02087
Abstract: Methods forecast voice signal components, wherein processors are configured to translate the audio data that includes voice data and a fabricated background noise into frequency domain data; identify a threshold number of top frequencies within the frequency domain data; and generate a hash code value from the threshold number of top frequencies. Processors are configured to, in response to determining that the generated hash code value is unique from other hash code values that are indexed to each of a unique identification of the speaker and a background noise profile identification of the fabricated background noise, index a model of the threshold number of top frequencies in association with the hash code to the speaker identification and to the background noise profile.
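The pipeline in this abstract (frequency transform, top-frequency selection, hashing, indexing only when the hash is new) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the naive DFT, the use of SHA-1, the function names, and the synthetic two-tone frame standing in for voice plus fabricated noise.

```python
import cmath
import hashlib
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short illustrative frames)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def top_frequency_bins(samples, top_n):
    """Return the indices of the top_n strongest frequency bins, sorted ascending."""
    mags = magnitude_spectrum(samples)
    strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:top_n]
    return tuple(sorted(strongest))


def hash_code(bins):
    """Hash the bin indices into a short hex code (SHA-1 is an arbitrary choice)."""
    return hashlib.sha1(",".join(map(str, bins)).encode("ascii")).hexdigest()[:16]


def index_if_unique(index, samples, speaker_id, noise_profile_id, top_n=2):
    """Store the top-bin model under (speaker, noise profile) only if its hash is new."""
    bins = top_frequency_bins(samples, top_n)
    code = hash_code(bins)
    bucket = index.setdefault((speaker_id, noise_profile_id), {})
    if code in bucket:
        return code, False
    bucket[code] = bins
    return code, True


# Synthetic frame (assumption): tones at bins 3 and 7 of a 32-sample frame.
N = 32
frame = [math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.sin(2 * math.pi * 7 * t / N)
         for t in range(N)]
index = {}
code_first, added_first = index_if_unique(index, frame, "speaker-1", "noise-A")
code_again, added_again = index_if_unique(index, frame, "speaker-1", "noise-A")
```

Submitting the same frame a second time leaves the index unchanged, which mirrors the uniqueness check described in the abstract.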
-
Publication No.: US20180308504A1
Publication Date: 2018-10-25
Application No.: US16016976
Application Date: 2018-06-25
Applicant: BOSE CORPORATION
Inventors: Cristian M. Hera , Jeffery R. Vautin , Elie Bou Daher , Paraskevas Argyropoulos , Vigneish Kathavarayan
IPC Classification: G10L21/0232 , H03G5/16 , G10L25/78 , G10L21/0364 , H04R3/04 , H04R1/40 , H04R1/02 , G10L21/0208 , G10L21/0216
CPC Classification: G10L21/0232 , G10L21/0208 , G10L21/0364 , G10L25/78 , G10L2021/02082 , G10L2021/02087 , G10L2021/02166 , H03G5/165 , H04R3/005 , H04R2499/13 , H04S7/301
Abstract: Audio systems and methods for providing intelligible audio content within a vehicle cabin. In one example, the audio system includes a first speaker to provide first audio content to a first seating position based on an audio signal received from an audio signal source, a second speaker to provide second audio content to a second seating position, a first microphone assembly positioned to detect speech content originating at the second seating position, leaked second audio content from the second speaker, and road noise, and audio signal processing circuitry configured to determine a perturbing signal based at least in part on a combination of the first speech content, the leaked second audio content, and the road noise, and adjust the audio signal to the first speaker to compensate for an effect of the perturbing signal on the first audio content at the first seating position.
-
Publication No.: US20180285059A1
Publication Date: 2018-10-04
Application No.: US15478090
Application Date: 2017-04-03
Inventors: Robert Zurek , Amit Kumar Agrawal , Himanshu Chug
IPC Classification: G06F3/16 , H04L29/06 , H04L12/18 , G06F17/28 , G10L15/00 , G10L15/26 , G10L21/10 , G10L15/30 , G10L21/0272
CPC Classification: G06F3/165 , G06F17/275 , G06F17/279 , G06F17/289 , G10L15/005 , G10L21/0208 , G10L2021/02087 , H04L12/18 , H04L65/1089 , H04L65/1096 , H04L65/403 , H04M3/568 , H04M3/569 , H04M2203/2061
Abstract: Apparatuses, methods, program products, and systems are disclosed for language-based muting during multiuser communications. A method includes determining, by use of a processor, a language of speech being spoken by a user of a plurality of users communicating over a network, comparing the determined language to one or more languages that each of the plurality of users has in common, and muting the speech in response to the determined language not matching a language of the one or more languages that each of the plurality of users has in common.
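The muting decision in this abstract reduces to a set intersection followed by a membership test. The sketch below assumes the spoken language has already been detected by some other component; the function names, the language codes, and the sample participants are hypothetical.

```python
def common_languages(user_languages):
    """Return the set of languages shared by every participant."""
    sets = [set(langs) for langs in user_languages.values()]
    return set.intersection(*sets) if sets else set()


def should_mute(detected_language, user_languages):
    """Mute when the detected language is not one that all participants share."""
    return detected_language not in common_languages(user_languages)


# Hypothetical call participants and the languages each one understands.
participants = {
    "ana": ["en", "es"],
    "bo": ["en", "zh"],
    "cy": ["en", "es", "zh"],
}
mute_spanish = should_mute("es", participants)
mute_english = should_mute("en", participants)
```

With these participants only English is common to all three, so Spanish speech would be muted and English speech would pass through.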
-
Publication No.: US20180254052A1
Publication Date: 2018-09-06
Application No.: US15879635
Application Date: 2018-01-25
Inventor: Eiiti HOSONO
IPC Classification: G10L21/0208 , H04W76/10 , H04M3/42
CPC Classification: G10L21/0208 , G10L2021/02087 , H03G3/32 , H04M3/42212 , H04M2203/2044 , H04M2203/2094 , H04W76/10 , Y02D70/00 , Y02D70/164
Abstract: An input unit receives a request for transmission of a speech signal to a predetermined group. When the request for transmission is received in the input unit, a determination unit determines a process related to reproduction of the speech signal in a terminal device expected to receive a speech, based on a distance between terminal devices expected to receive the speech signal, excluding the terminal device outputting the request from a plurality of terminal devices belonging to the predetermined group. An output unit outputs the detail determined in the determination unit.
-
Publication No.: US20180122399A1
Publication Date: 2018-05-03
Application No.: US15120130
Application Date: 2015-03-02
IPC Classification: G10L21/0232 , G10L21/04 , G10L25/18 , H04R3/00
CPC Classification: G10L21/0232 , G10L21/0208 , G10L21/04 , G10L25/18 , G10L2021/02087 , G10L2021/02165 , G10L2021/02166 , H04R3/005
Abstract: A noise suppressor comprises a first (401) and a second transformer (403) for generating a first and second frequency domain signal from a frequency transform of a first and second microphone signal. A gain unit (405, 407, 409) determines time frequency tile gains in response to a difference measure for magnitude time frequency tile values of the first frequency domain signal and magnitude time frequency tile values of the second frequency domain signal. A scaler (411) generates a third frequency domain signal by scaling time frequency tile values of the first frequency domain signal by the time frequency tile gains; and the resulting signal is converted to the time domain by a third transformer (413). A designator (405, 407, 415) designates time frequency tiles of the first frequency domain signal as speech tiles or noise tiles; and the gain unit (409) determines the gains in response to the designation of the time frequency tiles as speech tiles or noise tiles.
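A rough sketch of the gain and scaling stages described in this abstract follows. The abstract does not give the difference measure or the designation rule, so the specifics are assumptions: a tile is called speech when the first-microphone magnitude exceeds `ratio_threshold` times the second, and the gain is a magnitude subtraction with a larger (hypothetical) subtraction factor for noise tiles. The frequency transforms themselves are omitted; the inputs are taken to be complex tile values.

```python
def tile_gain(mag1, mag2, speech_factor=1.0, noise_factor=2.0, ratio_threshold=2.0):
    """Gain for one time-frequency tile from the two microphone magnitudes.

    All parameter values are illustrative assumptions, not figures from the patent.
    """
    if mag1 <= 0.0:
        return 0.0
    # Designation step (assumed rule): the first microphone dominates in speech tiles.
    is_speech = mag1 > ratio_threshold * mag2
    factor = speech_factor if is_speech else noise_factor
    # Difference measure (assumed): magnitude subtraction, clamped at zero.
    return max(0.0, (mag1 - factor * mag2) / mag1)


def suppress(tiles1, tiles2):
    """Scale each complex tile of the first signal by its tile gain."""
    return [x1 * tile_gain(abs(x1), abs(x2)) for x1, x2 in zip(tiles1, tiles2)]


# Two tiles: one where the first microphone dominates, one where it does not.
first_mic = [4 + 0j, 1 + 0j]
second_mic = [1 + 0j, 1 + 0j]
cleaned = suppress(first_mic, second_mic)
```

The first tile is kept at three quarters of its magnitude, while the second tile is designated as noise and scaled to zero.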
-
Publication No.: US20180059155A1
Publication Date: 2018-03-01
Application No.: US15645011
Application Date: 2017-07-10
Applicant: FUJITSU LIMITED
Inventors: Sayuri Nakayama , TARO TOGAWA , Takeshi OTANI
CPC Classification: G01R23/16 , G10L19/02 , G10L21/0232 , G10L2021/02087 , G10L2021/02165 , H03G3/32 , H03G5/165 , H04H60/04 , H04R1/222 , H04R3/005 , H04R2430/03
Abstract: A sound processing device performs obtaining a first frequency spectrum that corresponds to a first sound signal and a second frequency spectrum that corresponds to a second sound signal, calculating a level difference between a level of each of frequency components in the first frequency spectrum and a level of each of frequency components in the second frequency spectrum, calculating a spread of a distribution of the level difference during a prescribed period for each of the frequency components, and determining a gain to be multiplied to the frequency component in the first frequency spectrum and a gain to be multiplied to the frequency component in the second frequency spectrum in accordance with the spread of the distribution of the level difference.
-
Publication No.: US20180041639A1
Publication Date: 2018-02-08
Application No.: US15667510
Application Date: 2017-08-02
Inventors: David GUNAWAN , Glenn N. DICKINS
IPC Classification: H04M3/56 , H04L29/06 , G10L21/0208
CPC Classification: H04M3/569 , G10L21/02 , G10L21/0208 , G10L25/78 , G10L2021/02085 , G10L2021/02087 , H04L65/1089 , H04L65/403 , H04L65/4038 , H04L65/604
Abstract: Systems and methods are described for modifying one of far-end signal playback and capture of local audio on an audio device. Frames of both a far-end audio stream and a near-end audio stream may be analyzed using a measure of voice activity, the analyzing producing voice data associated with each frame. Based on the voice data, a conference state may be determined, and one of playback of the far-end audio stream and capture of local audio on an audio device may be modified based on the determined conference state. By associating the likely intent with a predefined state, the device may further cull or remove unwanted or unlikely content from the device input and output. This may have a substantial advantage in allowing for full duplex operation in the case of more meaningful and continuing voice activity, particularly in the case where there are many connected endpoints.
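One way to picture the state determination in this abstract is a small classifier over per-frame voice-activity flags. The state names, the 0.5 activity ratio, and the capture gains below are all hypothetical; the abstract does not specify the states or how playback and capture are modified.

```python
def is_active(vad_flags, active_ratio=0.5):
    """Treat a stream as active when enough of its recent frames contain voice."""
    return bool(vad_flags) and sum(vad_flags) / len(vad_flags) >= active_ratio


def conference_state(near_vad, far_vad):
    """Map near-end and far-end voice activity to a (hypothetical) conference state."""
    near, far = is_active(near_vad), is_active(far_vad)
    if near and far:
        return "double_talk"
    if near:
        return "near_end_talk"
    if far:
        return "far_end_talk"
    return "idle"


# Assumed policy: attenuate local capture while only the far end is talking or nobody is.
CAPTURE_GAIN = {
    "double_talk": 1.0,
    "near_end_talk": 1.0,
    "far_end_talk": 0.25,
    "idle": 0.5,
}

# Per-frame voice-activity flags (1 = voice detected) for the last four frames.
state = conference_state(near_vad=[0, 0, 1, 0], far_vad=[1, 1, 1, 0])
gain = CAPTURE_GAIN[state]
```

Keeping the capture gain at 1.0 in the double-talk state reflects the abstract's point about allowing full duplex operation when both ends have meaningful voice activity.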
-
Publication No.: US20170337924A1
Publication Date: 2017-11-23
Application No.: US15226527
Application Date: 2016-08-02
Inventor: Dong Yu
IPC Classification: G10L17/04 , G10L17/18 , G10L19/022 , G10L21/0272
CPC Classification: G10L17/04 , G06K9/624 , G06K9/6246 , G10L17/18 , G10L19/022 , G10L21/0272 , G10L2021/02087
Abstract: The techniques described herein improve methods to equip a computing device to conduct automatic speech recognition (“ASR”) in talker-independent multi-talker scenarios. In some examples, permutation invariant training of deep learning models can be used for talker-independent multi-talker scenarios. In some examples, the techniques can determine a permutation-considered assignment between a model's estimate of a source signal and the source signal. In some examples, the techniques can include training the model generating the estimate to minimize a deviation of the permutation-considered assignment. These techniques can be implemented into a neural network's structure itself, solving the label permutation problem that prevented making progress on deep learning based techniques for speech separation. The techniques discussed herein can also include source tracing to trace streams originating from a same source through the frames of a mixed signal.
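The permutation-considered assignment mentioned in this abstract can be sketched as a loss that tries every pairing of model estimates with reference sources and keeps the best one. This is a minimal illustration of the general permutation invariant training idea, assuming mean squared error as the deviation measure and plain Python lists as signals; it is not the patent's model or training code.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_loss(estimates, sources):
    """Return (lowest mean loss, permutation), where estimate i pairs with source perm[i]."""
    best_loss, best_perm = None, None
    for perm in permutations(range(len(sources))):
        loss = sum(mse(estimates[i], sources[j]) for i, j in enumerate(perm)) / len(sources)
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# The model's outputs match the references, but in the opposite order.
estimates = [[0.0, 1.0], [1.0, 0.0]]
sources = [[1.0, 0.0], [0.0, 1.0]]
loss, assignment = pit_loss(estimates, sources)
```

Because the loss is taken over the best pairing, the swapped output order costs nothing, which is how this approach sidesteps the label permutation problem. The exhaustive search grows factorially with the number of sources, so it is practical only for a small number of talkers.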
-
Publication No.: US09800983B2
Publication Date: 2017-10-24
Application No.: US14807011
Application Date: 2015-07-23
Inventors: Sylvie Wacquant , Michael Biemer
IPC Classification: H04R27/00 , H04R3/00 , G10L21/0208 , G10L21/0216 , H04R3/12 , H04S7/00 , G10K11/34
CPC Classification: H04R27/00 , G10K11/346 , G10K2210/1282 , G10L21/0208 , G10L2021/02087 , G10L2021/02166 , H04R3/005 , H04R3/12 , H04R2227/009 , H04R2420/03 , H04R2499/13 , H04S7/302
Abstract: A sound system of a vehicle includes a plurality of microphones disposed in a cabin of the vehicle, a plurality of speakers disposed in the cabin of the vehicle, and a sound processor operable to process microphone output signals of the microphones to determine a voice signal of a speaking occupant in the vehicle at or near one of the microphones. The sound processor generates a processor output signal that is provided to at least some of the speakers. Responsive to the processor output signal, the at least some of the speakers generate sound representative of the voice signal of the speaking occupant to direct the sound towards other occupants in the vehicle, while one or more speakers at or near the seat occupied by the speaking occupant do not generate sound representative of the voice signal of the speaking occupant.
-
Publication No.: US09767819B2
Publication Date: 2017-09-19
Application No.: US14775830
Application Date: 2013-04-11
Inventors: Markus Buck , Tim Haulick , Tobias Wolff , Suhadi Suhadi
IPC Classification: G10L21/02 , G10L21/0208 , H04M9/08 , G10L15/20 , G10L15/22
CPC Classification: G10L21/0208 , G10L15/20 , G10L15/22 , G10L2015/223 , G10L2021/02082 , G10L2021/02087 , H04M9/082
Abstract: In one aspect, the present application is directed to a device for providing different levels of sound quality in an audio entertainment system. The device includes a speech enhancement system with a reference signal modification unit and a plurality of acoustic echo cancellation filters. Each acoustic echo cancellation filter is coupled to a playback channel. The device includes an audio playback system with loudspeakers. Each loudspeaker is coupled to a playback channel. At least one of the speech enhancement system and the audio playback system operates according to a full sound quality mode and a reduced sound quality mode. In the full sound quality mode, all of the playback channels contain non-zero output signals. In the reduced sound quality mode, a first subset of the playback channels contains non-zero output signals and a second subset of the playback channels contains zero output signals.