Patent search: ipc:"G10L21/0308", page 1

1.

Published application
SIGNAL PROCESSING DEVICE, SIGNAL PROCESSING METHOD, AND SIGNAL PROCESSING PROGRAM (Status: pending, published)

Publication No.: US20240274149A1

Publication date: 2024-08-15

Application No.: US18561727

Filing date: 2021-05-25

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Hiroshi SATO, Tsubasa OCHIAI, Marc DELCROIX, Keisuke KINOSHITA

IPC classification: G10L21/0308, G10L21/0208, G10L25/84

CPC classification: G10L21/0308, G10L21/0208, G10L25/84

Abstract: A voice recognition input determination unit includes SIR-SNR acquisition circuitry that acquires, from a mixed voice in which a voice of another speaker overlaps with a voice of a target speaker, at least one of the mixed voice and a set of a signal-to-interference ratio (SIR) that is a ratio of a target voice to an interference speaker voice in the mixed voice and a signal-to-noise ratio (SNR) that is a ratio of the target voice to a noise in the mixed voice. Further, there is determination circuitry that determines a voice based on at least one of the mixed voice and an enhanced voice obtained by enhancing the mixed voice as a voice to be used for voice recognition on the basis of at least one of the mixed voice and the set of the SIR and the SNR.
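
The SIR and SNR named in this abstract are power ratios, usually expressed in decibels. The sketch below shows how the two ratios could be computed when the component signals are known, followed by a selection rule. The rule itself (prefer the enhanced voice when the interfering speaker is stronger than the noise) and the names `ratio_db` and `choose_asr_input` are illustrative assumptions and are not taken from the patent.

```python
import math

def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)

def ratio_db(numerator, denominator):
    """Power ratio of two signals in decibels."""
    return 10.0 * math.log10(power(numerator) / power(denominator))

def choose_asr_input(sir_db, snr_db):
    """Hypothetical rule (an assumption, not the patented criterion):
    use the enhanced voice when the interfering speaker dominates the
    noise (SIR below SNR), otherwise keep the mixed voice."""
    return "enhanced" if sir_db < snr_db else "mixed"

# Toy signals: a target, a weak interfering speaker, and stronger noise.
target = [1.0, -1.0, 1.0, -1.0]
interferer = [0.1, -0.1, 0.1, -0.1]
noise = [0.5, -0.5, 0.5, -0.5]

sir_db = ratio_db(target, interferer)  # target vs. interfering speaker
snr_db = ratio_db(target, noise)       # target vs. noise
decision = choose_asr_input(sir_db, snr_db)
```

In practice the component signals are not available separately, so the ratios would have to be estimated from the mixed voice; that estimation step is outside this sketch.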

2.

Published application
INFORMATION PROCESSING APPARATUS, SIGNAL PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM (Status: pending, published)

Publication No.: US20240265934A1

Publication date: 2024-08-08

Application No.: US18563940

Filing date: 2022-02-17

Applicant: Sony Group Corporation

Inventors: Yuichiro KOYAMA, Michael HENTSCHEL, Kan KURODA, Masanobu NAKAMURA, Hiroaki OGAWA, Kentaro SHIBATA, Takashi SHIBUYA, Noriko TOTSUKA, Emiru TSUNOO, Toshimitsu UESAKA, Keiichi YAMADA

IPC classification: G10L21/10, G10L21/0308

CPC classification: G10L21/10, G10L21/0308

Abstract: An information processing apparatus according to an embodiment of the present technology includes a signal processing unit. The signal processing unit extracts, from a plurality of observation signals obtained by a group of microphones, voice signals respectively related to the microphones by machine learning. This allows desired signal output. Moreover, a person on the other end of the line can easily hear the voice of a speaker and cannot hear the voice of other speaker, thereby providing a feeling of security and high confidentiality for users.

3.

Published application
SYSTEMS AND METHODS FOR REMOTE REAL-TIME AUDIO MONITORING (Status: pending, published)

Publication No.: US20240257823A1

Publication date: 2024-08-01

Application No.: US18521676

Filing date: 2023-11-28

Applicant: MIXHalo Corp.

Inventor: Carlos J. Morales Batista

IPC classification: G10L21/0232, G08B21/18, G10L19/022, G10L21/0308

CPC classification: G10L21/0232, G08B21/182, G10L19/022, G10L21/0308

Abstract: A method for remotely monitoring audio signal variance in real-time by a cloud-based virtual host communicatively coupled to an audio server computing device includes receiving and processing network packets that contain an audio signal. The method also includes calculating an audio signal variance based on the processed network packets containing the audio signal. The method also includes determining whether the audio signal variance is below a threshold and, in response to determining that the audio signal variance is below the threshold, generating an alert indicating that the audio signal variance is below the threshold.
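
The variance check described in this abstract can be sketched in a few lines. The function name `variance_alert`, the use of population variance over a block of decoded samples, and the threshold value are assumptions made for illustration; the abstract does not specify how the variance is computed or what the threshold is.

```python
from statistics import pvariance

def variance_alert(samples, threshold):
    """Return (variance, alert) for one block of decoded audio samples.

    alert is True when the variance falls below the threshold, which
    would typically indicate silence or a stalled stream.
    """
    variance = pvariance(samples)
    return variance, variance < threshold

# Hypothetical threshold, chosen only for this example.
THRESHOLD = 1e-4

silent_block = [0.0] * 8
active_block = [0.5, -0.5] * 4

silent_result = variance_alert(silent_block, THRESHOLD)
active_result = variance_alert(active_block, THRESHOLD)
```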

4.

Published application
Audio Source Separation using Hyperbolic Embeddings (Status: pending, published)

Publication No.: US20240194213A1

Publication date: 2024-06-13

Application No.: US18191417

Filing date: 2023-03-28

Applicant: Mitsubishi Electric Research Laboratories, Inc.

Inventors: Gordon Wichern, Jonathan Le Roux, Darius Petermann, Aswin Shanmugam Subramanian

IPC classification: G10L21/0308, G01H3/08, G10L25/18, G10L25/21, G10L25/30, G10L25/51

CPC classification: G10L21/0308, G01H3/08, G10L25/18, G10L25/21, G10L25/30, G10L25/51

Abstract: There is provided an audio processing system and method comprising an input interface that receives an input audio mixture and transforms it into a time-frequency representation defined by values of time-frequency bins, a processor that maps the values of time-frequency bins into a hyperbolic space by executing an embedding neural network trained to associate each time-frequency bin to a high-dimensional embedding and projecting each high-dimensional embedding into the hyperbolic space, and an output interface that accepts a selection of at least a portion of the hyperbolic space and renders selected hyperbolic embeddings falling within the selected portion of the hyperbolic space.
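
One common way to project a Euclidean embedding into hyperbolic space is the exponential map at the origin of the Poincaré ball. The sketch below assumes that model and a curvature parameter `c`; the abstract does not say which hyperbolic model or projection the system uses, so `expmap0` is an illustration only.

```python
import math

def expmap0(v, c=1.0):
    """Exponential map at the origin of the Poincare ball with curvature -c.

    Maps a Euclidean vector v to a point strictly inside the ball of
    radius 1/sqrt(c). The choice of this model is an assumption.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    sqrt_c = math.sqrt(c)
    scale = math.tanh(sqrt_c * norm) / (sqrt_c * norm)
    return [scale * x for x in v]

# A stand-in for one embedding vector produced by the neural network.
embedding = [3.0, 4.0]
point = expmap0(embedding)
point_norm = math.sqrt(sum(x * x for x in point))
```

Under this map, embeddings with a large Euclidean norm land near the boundary of the ball and small ones near the origin, so selecting a region of the ball amounts to selecting the time-frequency bins whose embeddings fall inside it.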

5.

Published application
Audio Signal Enhancement with Recursive Restoration Employing Deterministic Degradation (Status: pending, published)

Publication No.: US20240170003A1

Publication date: 2024-05-23

Application No.: US18492377

Filing date: 2023-10-23

Applicant: Mitsubishi Electric Research Laboratories, Inc.

Inventors: Jonathan Le Roux, François G. Germain, Gordon Wichern, Hao Yen

IPC classification: G10L21/0308, G10L25/30

CPC classification: G10L21/0308, G10L25/30

Abstract: An audio processing system and method for processing audio is disclosed. The audio processing system collects an input audio signal indicative of degraded measurements of a target audio waveform. The input audio signal is restored with recursive restoration that recursively restores the input audio signal until a termination condition is met. A current iteration of the recursive restoration applies a restoration operator configured to restore a degraded audio signal conditioned on a current level of severity of degradation and degrades the degraded audio signal deterministically with a level of severity less than the current level of severity. A target signal estimate indicative of enhanced measurements of the audio waveform is generated as output.
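
The restore-then-degrade recursion in this abstract can be sketched as a short loop. Everything concrete here is an assumption made for illustration: the termination condition (severity reaching zero), the integer severity levels, and the toy `toy_degrade` and `toy_restore` operators, which simply attenuate and amplify by powers of two. In the patent the restoration operator would presumably be a learned model.

```python
def recursive_restore(observed, severity, restore, degrade):
    """Alternate restoration and milder re-degradation.

    Each iteration restores the current signal given the current
    severity, then degrades the estimate again at a lower severity.
    Assumed termination condition: severity reaches zero.
    """
    current = list(observed)
    estimate = current
    while severity > 0:
        estimate = restore(current, severity)
        severity -= 1
        current = degrade(estimate, severity)
    return estimate

def toy_degrade(signal, severity):
    """Hypothetical deterministic degradation: attenuate by 0.5 ** severity."""
    return [s * 0.5 ** severity for s in signal]

def toy_restore(signal, severity):
    """Hypothetical restoration operator: exact inverse of toy_degrade."""
    return [s / 0.5 ** severity for s in signal]

clean = [1.0, -2.0, 0.5]
observed = toy_degrade(clean, 3)
restored = recursive_restore(observed, 3, toy_restore, toy_degrade)
```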

6.

Granted patent
Signal separation apparatus, signal separation method and program (Status: in force)

Publication No.: US11922966B2

Publication date: 2024-03-05

Application No.: US17276256

Filing date: 2019-10-01

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventor: Hiroshi Sawada

IPC classification: G10L21/028, G06F17/16, G10L21/0308, G10L19/038

CPC classification: G10L21/0308, G06F17/16, G10L19/038, G10L21/028

Abstract: A signal separation device for acquiring a source signal from a mixed signal observed by a plurality of sensors includes: a database that stores feature information of a clean signal; separation matrix calculation means for repeatedly performing processes of, based on a separated signal obtained by multiplication of a mixed signal converted into a time-frequency representation by a separation matrix and on the feature information stored in the database, calculating a parameter to be used for an objective function for optimizing the separation matrix, and calculating a separation matrix for minimizing the objective function using the parameter; and output means for outputting a separated signal calculated using the optimized separation matrix obtained by the separation matrix calculation means.
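
The core linear step in this abstract, multiplying a time-frequency bin of the mixed signal by a separation matrix, can be sketched as follows. The example covers only that multiplication; the iterative optimization of the matrix against the stored feature information is not shown. The 2x2 mixing matrix, the use of its exact inverse as the separation matrix, and the function names are assumptions for illustration.

```python
def apply_matrix(matrix, vector):
    """Matrix-vector product for one time-frequency bin (complex values)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def invert_2x2(m):
    """Inverse of a 2x2 matrix, used here as an idealized separation matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical instantaneous mixing of two sources at two sensors.
mixing = [[1.0, 0.5], [0.5, 1.0]]
sources = [1 + 1j, 2 - 1j]

mixed = apply_matrix(mixing, sources)          # what the sensors observe
separation = invert_2x2(mixing)                # in practice this is estimated
separated = apply_matrix(separation, mixed)    # recovered source bins
```

A real system does not know the mixing matrix, which is why the patent describes repeatedly updating the separation matrix by minimizing an objective function.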

7.

Granted patent
Signal separation apparatus, signal separation method and program (Status: in force)

Publication No.: US11915717B2

Publication date: 2024-02-27

Application No.: US17292529

Filing date: 2019-07-01

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Hiroshi Sawada, Rintaro Ikeshita, Nobutaka Ito, Tomohiro Nakatani

IPC classification: G06F17/16, G10L21/0308, H04S3/02

CPC classification: G10L21/0308, G06F17/16

Abstract: The signal separation device includes: cross product calculation means receiving an input of an observed signal that is a mixture of a plurality of target signals, and calculating a cross product of the observed signal; model calculation means updating a parameter of a model for estimating the cross product with a predetermined algorithm using an inverse matrix of a matrix that represents an estimate of the cross product; inverse matrix calculation means calculating the inverse matrix of a matrix by a SIMD command when the parameter is updated; and separation means calculating the target signals using a matrix representing an estimate of the cross product, the updated parameter, and the observed signal.

8.

Granted patent
Voice-controlled management of user profiles (Status: in force)

Publication No.: US11727939B2

Publication date: 2023-08-15

Application No.: US17568931

Filing date: 2022-01-05

Applicant: Telefonaktiebolaget LM Ericsson (publ)

Inventors: Volodya Grancharov, Tomer Amiaz, Hadar Gecht, Harald Pobloth

IPC classification: G10L17/02, G10L17/18, G10L15/08, G10L21/0308, G06F17/18

CPC classification: G10L17/02, G06F17/18, G10L15/08, G10L21/0308

Abstract: A network node in a communication network receives, from a user equipment, a cluster of audio segments. The network node calculates a first confidence measure representing a first probability that a first speaker model represents a speaker of the cluster of audio segments. The network node also calculates a second confidence measure representing a second probability that a second speaker model represents the speaker of the cluster of audio segments. In response to the first confidence measure and the second confidence measure both representing probabilities that are higher than a target probability, the network node updates a first user profile associated with the first speaker model and a second user profile associated with the second speaker model based on a user preference assigned to the cluster of audio segments.
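
The decision rule in this abstract, updating both profiles when both confidence measures exceed the target probability, can be sketched as below. The dictionary-based profile layout, the `preferences` key, and the function name are hypothetical. The abstract does not describe what happens when only one or neither confidence exceeds the target, so the sketch leaves the profiles untouched in those cases.

```python
def update_profiles_if_ambiguous(conf_first, conf_second, target,
                                 first_profile, second_profile, preference):
    """Update both user profiles when both speaker models are plausible.

    Returns True if the profiles were updated. Cases not covered by the
    abstract are left as no-ops here (an assumption).
    """
    if conf_first > target and conf_second > target:
        first_profile.setdefault("preferences", []).append(preference)
        second_profile.setdefault("preferences", []).append(preference)
        return True
    return False

# Hypothetical profiles and confidence values.
alice = {"name": "first user"}
bob = {"name": "second user"}

updated = update_profiles_if_ambiguous(0.8, 0.7, 0.6, alice, bob, "jazz")
skipped = update_profiles_if_ambiguous(0.8, 0.3, 0.6, alice, bob, "news")
```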

9.

Granted patent
Adaptive diarization model and user interface (Status: in force)

Publication No.: US11710496B2

Publication date: 2023-07-25

Application No.: US17596861

Filing date: 2019-07-01

Applicant: Google LLC

Inventors: Aaron Donsbach, Dirk Padfield

IPC classification: G10L15/26, G10L15/08, G10L21/0308, G06F3/0481, G06F3/16, G10L17/06, G10L17/24, G10L21/028

CPC classification: G10L21/0308, G06F3/0481, G06F3/167, G10L17/06, G10L17/24, G10L21/028

Abstract: A computing device receives a first audio waveform representing a first utterance and a second utterance. The computing device receives identity data indicating that the first utterance corresponds to a first speaker and the second utterance corresponds to a second speaker. The computing device determines, based on the first utterance, the second utterance, and the identity data, a diarization model configured to distinguish between utterances by the first speaker and utterances by the second speaker. The computing device receives, exclusively of receiving further identity data indicating a source speaker of a third utterance, a second audio waveform representing the third utterance. The computing device determines, by way of the diarization model and independently of the further identity data of the first type, the source speaker of the third utterance. The computing device updates the diarization model based on the third utterance and the determined source speaker.

10.

Granted patent
Vowel sensing voice activity detector (Status: in force)

Publication No.: US11587579B2

Publication date: 2023-02-21

Application No.: US17394870

Filing date: 2021-08-05

Applicant: Plantronics, Inc.

Inventor: Arthur Leland Schiro

IPC classification: G10L25/87, G10L25/93, G10L25/78, G10L21/0232, G10L21/0308, G10K11/175, G10L21/0208

Abstract: Methods and apparatuses for detecting user speech are described. In one example, a method for detecting user speech includes receiving a microphone output signal corresponding to sound received at a microphone and identifying a spoken vowel sound in the microphone signal. The method further includes outputting an indication of user speech detection responsive to identifying the spoken vowel sound.
