Patent search ipc:G10L21/0224, page 1

1.

Published application
LOW-LATENCY NOISE SUPPRESSION (pending, published)

Publication No.: US20240331716A1

Publication date: 2024-10-03

Application No.: US18611308

Filing date: 2024-03-20

Applicant: QUALCOMM Incorporated

Inventors: Jacob Jon BEAN, Rogerio Guedes ALVES, Vahid MONTAZERI, Erik VISSER

IPC classification: G10L21/0224, G06F1/16, G10L21/0232, G10L21/0216

CPC classification: G10L21/0224, G06F1/163, G10L21/0232, G10L2021/02166

Abstract: A device includes one or more processors configured to obtain audio data representing one or more audio signals. The audio data includes a first segment and a second segment subsequent to the first segment. The one or more processors are configured to perform one or more transform operations on the first segment to generate frequency-domain audio data. The one or more processors are configured to provide input data based on the frequency-domain audio data as input to one or more machine-learning models to generate a noise-suppression output. The one or more processors are configured to perform one or more reverse transform operations on the noise-suppression output to generate time-domain filter coefficients. The one or more processors are configured to perform time-domain filtering of the second segment using the time-domain filter coefficients to generate a noise-suppressed output signal.
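The data flow in this abstract (transform the first segment, derive a suppression output, inverse-transform it into filter taps, then filter the second segment in the time domain) can be illustrated with a minimal NumPy sketch. Everything named here is an assumption made for illustration: `suppression_gains` is a hypothetical Wiener-style stand-in for the machine-learning model, and the window, tap count, and truncation step are not taken from the patent.

```python
import numpy as np

def suppression_gains(spectrum):
    """Hypothetical stand-in for the ML model: one gain in [0, 1] per frequency bin."""
    power = np.abs(spectrum) ** 2
    noise_power = np.median(np.abs(spectrum)) ** 2  # crude noise estimate (assumption)
    return power / (power + noise_power + 1e-12)

def filter_from_segment(first_segment, num_taps=32):
    """Derive time-domain FIR coefficients from the first segment."""
    window = np.hanning(len(first_segment))
    spectrum = np.fft.rfft(first_segment * window)       # transform operation
    gains = suppression_gains(spectrum)                   # noise-suppression output
    impulse = np.fft.irfft(gains, n=len(first_segment))   # reverse transform
    # Real gains give an impulse response centred on index 0 (with wrap-around).
    # Rotate it to make it causal, truncate to num_taps, and taper the ends.
    impulse = np.roll(impulse, num_taps // 2)[:num_taps]
    return impulse * np.hanning(num_taps)

def suppress(first_segment, second_segment, num_taps=32):
    """Filter the second segment with coefficients computed from the first."""
    coeffs = filter_from_segment(first_segment, num_taps)
    return np.convolve(second_segment, coeffs)[: len(second_segment)]
```

Because the second segment is filtered sample by sample with coefficients that are already available, no frame of the second segment has to be buffered for a transform before output starts, which is one plausible reading of the "low-latency" in the title.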

2.

Published application
SYSTEM AND METHOD FOR MASK-BASED NEURAL BEAMFORMING FOR MULTI-CHANNEL SPEECH ENHANCEMENT (pending, published)

Publication No.: US20240331715A1

Publication date: 2024-10-03

Application No.: US18457921

Filing date: 2023-08-29

Applicant: Samsung Electronics Co., Ltd.

Inventors: Ching-Hua Lee, Chou-Chang Yang, Yilin Shen, Hongxia Jin

IPC classification: G10L21/0224

CPC classification: G10L21/0224, G10L2021/02166

Abstract: A method includes receiving, during a first time window, a set of noisy audio signals from a plurality of audio input devices. The method also includes generating a noisy time-frequency representation based on the set of noisy audio signals. The method further includes providing the noisy time-frequency representation as an input to a mask estimation model trained to output a mask used to predict a clean time-frequency representation of clean speech audio from the noisy time-frequency representation. The method also includes determining beamforming filter weights based on the mask. The method further includes applying the beamforming filter weights to the noisy time-frequency representation to isolate the clean speech audio from the set of noisy audio signals. In addition, the method includes outputting the clean speech audio.
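The abstract does not say which beamformer is derived from the mask. As one common possibility, the sketch below uses the mask to estimate speech and noise spatial covariance matrices and then forms MVDR weights with a fixed reference channel. The MVDR choice, the function names, the array layout, and the regularization constant are all assumptions, and the mask is taken as given rather than produced by a trained model.

```python
import numpy as np

def mvdr_weights(noisy_stft, speech_mask, ref_channel=0):
    """MVDR weights per frequency bin from a time-frequency speech mask.

    noisy_stft:  complex array, shape (channels, frames, bins)
    speech_mask: real array in [0, 1], shape (frames, bins)
    """
    num_channels, _, num_bins = noisy_stft.shape
    weights = np.zeros((num_bins, num_channels), dtype=complex)
    for f in range(num_bins):
        x = noisy_stft[:, :, f]                      # (channels, frames)
        m = speech_mask[:, f]                        # (frames,)
        # Mask-weighted spatial covariance estimates for speech and noise.
        phi_s = (m * x) @ x.conj().T / max(m.sum(), 1e-8)
        phi_n = ((1.0 - m) * x) @ x.conj().T / max((1.0 - m).sum(), 1e-8)
        phi_n = phi_n + 1e-6 * np.eye(num_channels)  # keep the solve well conditioned
        numerator = np.linalg.solve(phi_n, phi_s)    # phi_n^{-1} phi_s
        weights[f] = numerator[:, ref_channel] / (np.trace(numerator) + 1e-12)
    return weights

def apply_beamformer(weights, noisy_stft):
    """Compute y[t, f] = w[f]^H x[:, t, f] for every time-frequency point."""
    return np.einsum("fc,ctf->tf", weights.conj(), noisy_stft)
```

An inverse short-time transform of the beamformed representation would then give the output waveform; that step is omitted here.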

3.

Published application
AUDIO SEPARATION METHOD AND ELECTRONIC DEVICE FOR PERFORMING THE SAME (pending, published)

Publication No.: US20240274147A1

Publication date: 2024-08-15

Application No.: US18432572

Filing date: 2024-02-05

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Deokjun EOM, Kyungrae KIM, Woohyun NAM, Jungkyu KIM

IPC classification: G10L21/0272, G10L21/0224, G10L21/0232, G10L21/14

CPC classification: G10L21/0272, G10L21/0224, G10L21/0232, G10L21/14

Abstract: Provided is a method for separating one or more candidate audios in a sound source including the one or more candidate audios and a background sound, by using an audio separation system, the method including extracting a first audio feature from the sound source, extracting a background sound feature from the sound source, the background sound feature identifying a degree of association between the first audio feature and the background sound, generating a second audio feature based on the first audio feature, the background sound feature, and a background sound control parameter configured to control the background sound, and generating one or more separated audios based on target information corresponding to the one or more candidate audios, the first audio feature, and the second audio feature in which the background sound is adjusted.

4.

Granted patent
Method and apparatus for post-processing audio signal, storage medium, and electronic device (in force)

Publication No.: US12002484B2

Publication date: 2024-06-04

Application No.: US17736797

Filing date: 2022-05-04

Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Yang Yu, Yu Chen

IPC classification: G10L21/0232, G10L21/0208, G10L21/0224, H04R3/00

CPC classification: G10L21/0232, G10L21/0208, G10L21/0224, H04R3/002

Abstract: This application discloses a method and an apparatus for processing an audio signal. The method includes obtaining a first speech signal acquired by a first device; performing frame blocking on the first speech signal, to obtain multiple speech signal frames; converting the multiple speech signal frames into multiple first frequency domain signal frames; performing aliasing processing on a first sub-frequency domain signal frame among the multiple first frequency domain signal frames with a frequency lower than or equal to a target frequency threshold, and retaining a second sub-frequency domain signal frame among the multiple first frequency domain signal frames with a frequency higher than the target frequency threshold, to obtain multiple second frequency domain signal frames, the target frequency threshold being related to a sampling frequency of a second device; and performing frame fusion on the multiple second frequency domain signal frames, to obtain a second speech signal.

5.

Granted patent
Selection of quantization schemes for spatial audio parameter encoding (in force)

Publication No.: US11996109B2

Publication date: 2024-05-28

Application No.: US18146151

Filing date: 2022-12-23

Applicant: Nokia Technologies Oy

Inventors: Adriana Vasilache

IPC classification: G10L19/038, G10L19/022, G10L21/0224, G10L21/0232, G10L19/00

CPC classification: G10L19/038, G10L19/022, G10L21/0224, G10L21/0232, G10L2019/0001

Abstract: There is disclosed inter alia an apparatus for spatial audio signal encoding comprising means for receiving for each time frequency block of a sub band of an audio frame a spatial audio parameter comprising an azimuth and an elevation; determining a first distortion measure for the audio frame by determining a first distance measure for each time frequency block and summing the first distance measure for each time frequency block; determining a second distortion measure for the audio frame by determining a second distance measure for each time frequency block and summing the second distance measure for each time frequency block, and selecting either the first quantization scheme or the second quantization scheme for quantising the elevation and the azimuth for all time frequency blocks of the sub band of the audio frame, wherein the selecting is dependent on the first and second distortion measures.
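The selection logic in this abstract (sum a per-block distance for each candidate scheme, then pick one scheme for the whole sub band) can be sketched in a few lines of Python. The abstract does not define the distance measures or the quantizers, so the great-circle angle and the uniform grid quantizers below are assumptions chosen only to make the example concrete, and all names are hypothetical.

```python
import math

def angular_distance(az1, el1, az2, el2):
    """Great-circle angle in radians between two directions given in degrees."""
    a1, e1, a2, e2 = map(math.radians, (az1, el1, az2, el2))
    cos_angle = (math.sin(e1) * math.sin(e2)
                 + math.cos(e1) * math.cos(e2) * math.cos(a1 - a2))
    return math.acos(max(-1.0, min(1.0, cos_angle)))

def uniform_quantizer(step_az, step_el):
    """Hypothetical quantization scheme: snap to a uniform azimuth/elevation grid."""
    def quantize(az, el):
        return (round(az / step_az) * step_az, round(el / step_el) * step_el)
    return quantize

def select_scheme(blocks, first_scheme, second_scheme):
    """Sum the per-block distances for each scheme and pick the lower total.

    blocks: list of (azimuth, elevation) pairs, one per time-frequency block.
    """
    first_distortion = sum(
        angular_distance(az, el, *first_scheme(az, el)) for az, el in blocks)
    second_distortion = sum(
        angular_distance(az, el, *second_scheme(az, el)) for az, el in blocks)
    choice = "first" if first_distortion <= second_distortion else "second"
    return choice, first_distortion, second_distortion
```

For example, `select_scheme([(10.0, 4.0), (20.0, 0.0)], uniform_quantizer(10, 10), uniform_quantizer(45, 5))` picks the first scheme, because the fine azimuth grid costs less in total than the fine elevation grid for these two blocks.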

6.

Granted patent
Operation device (in force)

Publication No.: US11984133B2

Publication date: 2024-05-14

Application No.: US17775274

Filing date: 2020-10-29

Applicant: Sony Interactive Entertainment Inc.

Inventors: Takashi Enokihara, Kojiro Matsuyama, Takafumi Ogura, Tetsuro Horikawa, Shin Kimura, Nobuyuki Kihara

IPC classification: G10L21/0224, A63F13/215, A63F13/285, A63F13/54, G06F3/01, G06F3/033, G06F3/0338, G10L21/0208, G10L21/0216, G10L21/0232, H04R1/08, H04R3/00

CPC classification: G10L21/0224, A63F13/54, G06F3/016, G10L21/0232, H04R1/08, H04R3/00, G10L2021/02082, G10L2021/02163

Abstract: Disclosed is an operation device including an interaction member, a microphone, a control circuit, and an audio signal processing circuit. The interaction member is used for interacting with a user. The control circuit periodically acquires scan data indicating the acting status of the interaction member. The audio signal processing circuit executes a noise removal process of removing noise from a collected audio signal collected by the microphone. The control circuit periodically transmits previously acquired scan data to the audio signal processing circuit. The audio signal processing circuit executes the noise removal process by using the scan data transmitted from the control circuit.

7.

Granted patent
Audio recognition method, method, apparatus for positioning target audio, and device (in force)

Publication No.: US11967316B2

Publication date: 2024-04-23

Application No.: US17183209

Filing date: 2021-02-23

Applicant: Tencent Technology (Shenzhen) Company Limited

Inventors: Jimeng Zheng, Ian Ernan Liu, Yi Gao, Weiwei Li

IPC classification: G10L15/22, G01S3/80, G01S3/802, G10L15/08, G10L15/20, G10L21/0224, G10L21/0232, G10L25/51, G10L21/0208, G10L21/0216

CPC classification: G10L15/20, G01S3/8006, G01S3/802, G10L15/08, G10L15/22, G10L21/0224, G10L21/0232, G10L25/51, G10L2015/088, G10L2021/02082, G10L2021/02166

Abstract: Embodiments of this application disclose a method and apparatus for positioning a target audio signal by an audio interaction device, and an audio interaction device. The method includes: obtaining audio signals in a plurality of directions in a space, and performing echo cancellation on the audio signal, the audio signal including a target-audio direct signal; obtaining weights of a plurality of time-frequency points in the audio signals, a weight of each time-frequency point indicating, at the time-frequency point, a relative proportion of the target-audio direct signal in the audio signals; weighting time-frequency components of the audio signal at the plurality of time-frequency points separately for each of the plurality of directions by using the weights of the plurality of time-frequency points, to obtain a weighted audio signal energy distribution; and obtaining a sound source azimuth corresponding to the target-audio direct signal in the audio signals accordingly.
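The weighting step in this abstract can be illustrated with a small pure-Python sketch: each direction's time-frequency components are weighted by the share of target direct sound at that point, the weighted energies are summed per direction, and an azimuth is read off the resulting distribution. Taking the direction with the largest weighted energy is an assumption (the abstract only says the azimuth is obtained "accordingly"), and the function name and data layout are hypothetical. Echo cancellation and the weight estimation itself are not shown.

```python
def estimate_azimuth(direction_spectra, weights):
    """Pick the direction with the largest weighted time-frequency energy.

    direction_spectra: dict mapping azimuth (degrees) to a list of complex
                       time-frequency components for that direction.
    weights:           list of per-point weights in [0, 1], same length as
                       each component list.
    """
    energy = {
        azimuth: sum(w * abs(x) ** 2 for w, x in zip(weights, components))
        for azimuth, components in direction_spectra.items()
    }
    best = max(energy, key=energy.get)
    return best, energy
```

With `direction_spectra = {0: [1 + 0j, 3 + 0j], 90: [2 + 0j, 1 + 0j]}` and `weights = [1.0, 0.0]`, the second time-frequency point is treated as interference and ignored, so the estimate is 90 degrees even though the unweighted energy is larger at 0 degrees.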

8.

Granted patent
Ambient cooperative intelligence system and method (in force)

Publication No.: US11967305B2

Publication date: 2024-04-23

Application No.: US17840958

Filing date: 2022-06-15

Applicant: Nuance Communications, Inc.

Inventors: Dushyant Sharma, Patrick A. Naylor, Joel Praveen Pinto, Daniel Paulino Almendro Barreda

IPC classification: G10L13/02, G06F3/16, G06N5/02, G06N20/00, G10K15/08, G10L13/033, G10L15/02, G10L15/06, G10L15/065, G10L21/0224, G10L25/03, H04S7/00

CPC classification: G10L13/02, G06F3/165, G06N5/02, G06N20/00, G10K15/08, G10L13/033, G10L15/02, G10L15/063, G10L15/065, G10L21/0224, G10L25/03, H04S7/30, H04S7/302, H04S7/303

Abstract: A method, computer program product, and computing system for generating a three-dimensional model of at least a portion of a three-dimensional space incorporating an ACI system via a video recording subsystem of an ACI calibration platform; and generating one or more audio calibration signals for receipt by an audio recording system included within the ACI system via an audio generation subsystem of the ACI calibration platform.

9.

Granted patent
Neural-network-based-implemented ophthalmologic intelligent consultation method and apparatus (in force)

Publication No.: US11955240B1

Publication date: 2024-04-09

Application No.: US18465977

Filing date: 2023-09-12

Applicant: Renmin Hospital of Wuhan University (Hubei General Hospital)

Inventors: Xuan Xiao, Xiang Gao, Ting Chen, Ting Su, Xuejie Li

IPC classification: G06T7/00, A61B3/00, A61B3/14, A61B5/00, G06T5/30, G06T5/40, G06T7/12, G06V10/77, G06V10/82, G06V20/70, G10L15/02, G10L15/05, G10L15/16, G10L15/18, G10L15/22, G10L21/0224, G10L25/18, G10L25/21, G16H50/20

CPC classification: G16H50/20, A61B3/0025, A61B3/14, A61B5/4803, G06T5/30, G06T5/40, G06T7/0012, G06T7/12, G06V10/7715, G06V10/82, G06V20/70, G10L15/02, G10L15/05, G10L15/16, G10L15/1815, G10L15/22, G10L21/0224, G10L25/18, G10L25/21, G06T2207/20084, G06T2207/30041, G06T2207/30096, G06T2207/30101, G06V2201/03, G10L2015/025

Abstract: A neural-network-based-implemented ophthalmologic intelligent consultation method includes: performing correction filtering on a consultation voice of a patient, framing the voice into a consultation voice frame sequence, generating a consultation text corresponding to the consultation voice frame sequence based on phoneme recognition and phoneme transcoding, and extracting an ophthalmologically-described disease; performing gray-level filtering, primary picture segmentation, and size equalization operation on an eye picture set of the to-be-diagnosed patient to acquire a standard eyeball picture group; extracting eye white features, pupil features and blood vessel features from the standard eyeball picture group, performing lesion feature analysis on the eye white features, the pupil features and the blood vessel features to acquire an ophthalmologically-observed disease, and based on the ophthalmologically-observed disease and the ophthalmologically-described disease, generating a consultation result.

10.

Published application
Method and System for Reverberation Modeling of Speech Signals (pending, published)

Publication No.: US20240055012A1

Publication date: 2024-02-15

Application No.: US17819654

Filing date: 2022-08-15

Applicant: Mitsubishi Electric Research Laboratories, Inc.

Inventors: Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux

IPC classification: G10L21/0232, G10L15/06, G10L25/30, G10L21/0224

CPC classification: G10L21/0232, G10L15/063, G10L25/30, G10L21/0224, G10L2021/02082

Abstract: A system and method for reverberation reduction is disclosed. A first Deep Neural Network (DNN) produces a first estimate of a target direct-path signal from a mixture of acoustic signals that include the target direct-path signal and a reverberation of the target direct-path signal. A filter modeling a room impulse response (RIR) for the first estimate is estimated. The filter when applied to the first estimate of the target direct-path signal generates a result closest to a residual between the mixture of the acoustic signals and the first estimate of the target direct-path signal according to a distance function. The estimated filter is used for modeling the RIR.
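The filter-estimation step can be read as a fitting problem: find the filter that, applied to the direct-path estimate, best matches the residual (mixture minus estimate). The sketch below solves that problem in the time domain with ordinary least squares. The squared-error distance, the time-domain formulation, the fixed tap count, and the function name are all assumptions for illustration, and the DNN estimate is taken as given.

```python
import numpy as np

def estimate_reverb_filter(mixture, direct_estimate, num_taps):
    """Least-squares FIR filter mapping the direct-path estimate to the residual."""
    residual = mixture - direct_estimate
    n = len(direct_estimate)
    # Convolution matrix: column k holds the estimate delayed by k samples.
    delayed = np.zeros((n, num_taps))
    for k in range(num_taps):
        delayed[k:, k] = direct_estimate[: n - k]
    # Minimize || residual - delayed @ taps ||^2 over the filter taps.
    taps, *_ = np.linalg.lstsq(delayed, residual, rcond=None)
    return taps
```

On synthetic data where the mixture is the direct signal plus a known short filter applied to it, this recovers the filter taps up to numerical precision; with a real DNN estimate the fit would only be approximate.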
