-
Publication No.: US20240274149A1
Publication date: 2024-08-15
Application No.: US18561727
Filing date: 2021-05-25
Inventors: Hiroshi SATO, Tsubasa OCHIAI, Marc DELCROIX, Keisuke KINOSHITA
IPC classification: G10L21/0308, G10L21/0208, G10L25/84
CPC classification: G10L21/0308, G10L21/0208, G10L25/84
Abstract: A voice recognition input determination unit includes SIR-SNR acquisition circuitry that acquires, from a mixed voice in which a voice of another speaker overlaps with a voice of a target speaker, at least one of the mixed voice and a set of a signal-to-interference ratio (SIR) that is a ratio of a target voice to an interference speaker voice in the mixed voice and a signal-to-noise ratio (SNR) that is a ratio of the target voice to a noise in the mixed voice. Further, there is determination circuitry that determines a voice based on at least one of the mixed voice and an enhanced voice obtained by enhancing the mixed voice as a voice to be used for voice recognition on the basis of at least one of the mixed voice and the set of the SIR and the SNR.
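For orientation, the two ratios named in this abstract can be written out as follows. This is only a sketch under a common convention: it assumes both are power ratios expressed in decibels, which the abstract does not state, and the symbols P_s (target voice power), P_i (interfering speaker power), and P_n (noise power) are introduced here for illustration.

```latex
\mathrm{SIR} = 10 \log_{10} \frac{P_s}{P_i}
\qquad
\mathrm{SNR} = 10 \log_{10} \frac{P_s}{P_n}
```

Under this convention, a low SIR means the interfering speaker is strong relative to the target, and a low SNR means the background noise is.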
-
Publication No.: US20240265934A1
Publication date: 2024-08-08
Application No.: US18563940
Filing date: 2022-02-17
Inventors: Yuichiro KOYAMA, Michael HENTSCHEL, Kan KURODA, Masanobu NAKAMURA, Hiroaki OGAWA, Kentaro SHIBATA, Takashi SHIBUYA, Noriko TOTSUKA, Emiru TSUNOO, Toshimitsu UESAKA, Keiichi YAMADA
IPC classification: G10L21/10, G10L21/0308
CPC classification: G10L21/10, G10L21/0308
Abstract: An information processing apparatus according to an embodiment of the present technology includes a signal processing unit. The signal processing unit extracts, from a plurality of observation signals obtained by a group of microphones, voice signals respectively related to the microphones by machine learning. This allows desired signal output. Moreover, a person on the other end of the line can easily hear the voice of a speaker and cannot hear the voices of other speakers, thereby providing a feeling of security and high confidentiality for users.
-
Publication No.: US20240257823A1
Publication date: 2024-08-01
Application No.: US18521676
Filing date: 2023-11-28
Applicant: MIXHalo Corp.
IPC classification: G10L21/0232, G08B21/18, G10L19/022, G10L21/0308
CPC classification: G10L21/0232, G08B21/182, G10L19/022, G10L21/0308
Abstract: A method for remotely monitoring audio signal variance in real-time by a cloud-based virtual host communicatively coupled to an audio server computing device includes receiving and processing network packets that contain an audio signal. The method also includes calculating an audio signal variance based on the processed network packets containing the audio signal. The method also includes determining whether the audio signal variance is below a threshold and, in response to determining that the audio signal variance is below the threshold, generating an alert indicating that the audio signal variance is below the threshold.
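The threshold check described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function name `variance_alert`, the use of population variance over already-decoded samples, and the alert text are all assumptions made here for illustration, and packet reception and decoding are left out entirely.

```python
from statistics import pvariance


def variance_alert(samples, threshold):
    """Return an alert message if the sample variance is below threshold, else None.

    `samples` is a non-empty sequence of decoded audio sample values
    (hypothetical input; the abstract does not say how samples are obtained).
    """
    variance = pvariance(samples)  # population variance of the samples
    if variance < threshold:
        return (
            f"ALERT: audio signal variance {variance:.6f} "
            f"is below threshold {threshold:.6f}"
        )
    return None
```

A near-silent or stuck stream produces samples with almost no variance, so `variance_alert([0.0, 0.0, 0.0, 0.0], 1e-4)` returns an alert string, while a signal that swings between 1.0 and -1.0 returns `None`.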
-
Publication No.: US20240194213A1
Publication date: 2024-06-13
Application No.: US18191417
Filing date: 2023-03-28
Abstract: There is provided an audio processing system and method comprising an input interface that receives an input audio mixture and transforms it into a time-frequency representation defined by values of time-frequency bins, a processor that maps the values of time-frequency bins into a hyperbolic space by executing an embedding neural network trained to associate each time-frequency bin to a high-dimensional embedding and projecting each high-dimensional embedding into the hyperbolic space, and an output interface that accepts a selection of at least a portion of the hyperbolic space and renders selected hyperbolic embeddings falling within the selected portion of the hyperbolic space.
-
Publication No.: US20240170003A1
Publication date: 2024-05-23
Application No.: US18492377
Filing date: 2023-10-23
Inventors: Jonathan Le Roux, François G. Germain, Gordon Wichern, Hao Yen
IPC classification: G10L21/0308, G10L25/30
CPC classification: G10L21/0308, G10L25/30
Abstract: An audio processing system and method for processing audio is disclosed. The audio processing system collects an input audio signal indicative of degraded measurements of a target audio waveform. The input audio signal is restored with recursive restoration that recursively restores the input audio signal until a termination condition is met. A current iteration of the recursive restoration applies a restoration operator configured to restore a degraded audio signal conditioned on a current level of severity of degradation and degrades the degraded audio signal deterministically with a level of severity less than the current level of severity. A target signal estimate indicative of enhanced measurements of the audio waveform is generated as output.
-
Publication No.: US11922966B2
Publication date: 2024-03-05
Application No.: US17276256
Filing date: 2019-10-01
Inventors: Hiroshi Sawada
IPC classification: G10L21/028, G06F17/16, G10L21/0308, G10L19/038
CPC classification: G10L21/0308, G06F17/16, G10L19/038, G10L21/028
Abstract: A signal separation device for acquiring a source signal from a mixed signal observed by a plurality of sensors includes: a database that stores feature information of a clean signal; separation matrix calculation means for repeatedly performing processes of, based on a separated signal obtained by multiplication of a mixed signal converted into a time-frequency representation by a separation matrix and on the feature information stored in the database, calculating a parameter to be used for an objective function for optimizing the separation matrix, and calculating a separation matrix for minimizing the objective function using the parameter; and output means for outputting a separated signal calculated using the optimized separation matrix obtained by the separation matrix calculation means.
-
Publication No.: US11915717B2
Publication date: 2024-02-27
Application No.: US17292529
Filing date: 2019-07-01
IPC classification: G06F17/16, G10L21/0308, H04S3/02
CPC classification: G10L21/0308, G06F17/16
Abstract: The signal separation device includes: cross product calculation means receiving an input of an observed signal that is a mixture of a plurality of target signals, and calculating a cross product of the observed signal; model calculation means updating a parameter of a model for estimating the cross product with a predetermined algorithm using an inverse matrix of a matrix that represents an estimate of the cross product; inverse matrix calculation means calculating the inverse matrix of a matrix by a SIMD command when the parameter is updated; and separation means calculating the target signals using a matrix representing an estimate of the cross product, the updated parameter, and the observed signal.
-
Publication No.: US11727939B2
Publication date: 2023-08-15
Application No.: US17568931
Filing date: 2022-01-05
Inventors: Volodya Grancharov, Tomer Amiaz, Hadar Gecht, Harald Pobloth
IPC classification: G10L17/02, G10L17/18, G10L15/08, G10L21/0308, G06F17/18
CPC classification: G10L17/02, G06F17/18, G10L15/08, G10L21/0308
Abstract: A network node in a communication network receives, from a user equipment, a cluster of audio segments. The network node calculates a first confidence measure representing a first probability that a first speaker model represents a speaker of the cluster of audio segments. The network node also calculates a second confidence measure representing a second probability that a second speaker model represents the speaker of the cluster of audio segments. In response to the first confidence measure and the second confidence measure both representing probabilities that are higher than a target probability, the network node updates a first user profile associated with the first speaker model and a second user profile associated with the second speaker model based on a user preference assigned to the cluster of audio segments.
-
Publication No.: US11710496B2
Publication date: 2023-07-25
Application No.: US17596861
Filing date: 2019-07-01
Applicant: Google LLC
Inventors: Aaron Donsbach, Dirk Padfield
IPC classification: G10L15/26, G10L15/08, G10L21/0308, G06F3/0481, G06F3/16, G10L17/06, G10L17/24, G10L21/028
CPC classification: G10L21/0308, G06F3/0481, G06F3/167, G10L17/06, G10L17/24, G10L21/028
Abstract: A computing device receives a first audio waveform representing a first utterance and a second utterance. The computing device receives identity data indicating that the first utterance corresponds to a first speaker and the second utterance corresponds to a second speaker. The computing device determines, based on the first utterance, the second utterance, and the identity data, a diarization model configured to distinguish between utterances by the first speaker and utterances by the second speaker. The computing device receives, exclusively of receiving further identity data indicating a source speaker of a third utterance, a second audio waveform representing the third utterance. The computing device determines, by way of the diarization model and independently of the further identity data of the first type, the source speaker of the third utterance. The computing device updates the diarization model based on the third utterance and the determined source speaker.
-
Publication No.: US11587579B2
Publication date: 2023-02-21
Application No.: US17394870
Filing date: 2021-08-05
Applicant: Plantronics, Inc.
Inventors: Arthur Leland Schiro
IPC classification: G10L25/87, G10L25/93, G10L25/78, G10L21/0232, G10L21/0308, G10K11/175, G10L21/0208
Abstract: Methods and apparatuses for detecting user speech are described. In one example, a method for detecting user speech includes receiving a microphone output signal corresponding to sound received at a microphone and identifying a spoken vowel sound in the microphone signal. The method further includes outputting an indication of user speech detection responsive to identifying the spoken vowel sound.