Patent search ap:("SoundHound Page Inc.") AND inv:"Steve PEARSON"

1.

Invention application
VOICE MORPHING APPARATUS HAVING ADJUSTABLE PARAMETERS (Granted)

Publication No.: US20210217431A1

Publication date: 2021-07-15

Application No.: US16740440

Filing date: 2020-01-11

Applicant: SoundHound, Inc.

Inventor: Steve PEARSON

IPC: G10L21/013 , G10L21/0208 , G06N3/08 , G06N20/00

Abstract: A voice morphing apparatus having adjustable parameters is described. The disclosed system and method include a voice morphing apparatus that morphs input audio to mask a speaker's identity. Parameter adjustment uses evaluation of an objective function that is based on the input audio and output of the voice morphing apparatus. The voice morphing apparatus includes objectives that are based adversarially on speaker identification and positively on audio fidelity. Thus, the voice morphing apparatus is adjusted to reduce identifiability of speakers while maintaining fidelity of the morphed audio. The voice morphing apparatus may be used as part of an automatic speech recognition system.

2.

Invention publication
TEXT-TO-SPEECH SYSTEM WITH VARIABLE FRAME RATE (Pending, published)

Publication No.: US20240144910A1

Publication date: 2024-05-02

Application No.: US18051507

Filing date: 2022-10-31

Applicant: SoundHound, Inc.

Inventor: Steve PEARSON , Jon GROSSMAN

IPC: G10L13/047 , G10L13/06

CPC classification number: G10L13/047 , G10L13/06

Abstract: A neural TTS system is trained to generate key acoustic frames at variable rates while omitting other frames. The frame skipping depends on the acoustic features to be generated for the input text. The TTS system can interpolate frames between the key frames at a target rate for a vocoder to synthesize audio samples.
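
The interpolation step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which interpolation is used, so linear interpolation is an assumption here, and the names `interpolate_frames`, `key_times`, `key_frames`, and `frame_period` are hypothetical, not taken from the application.

```python
from bisect import bisect_right


def interpolate_frames(key_times, key_frames, frame_period):
    """Linearly interpolate key frames onto a uniform time grid.

    key_times: strictly increasing times of the key frames (assumed input).
    key_frames: one feature vector (list of floats) per key time.
    frame_period: spacing of the output grid, i.e. 1 / target frame rate.
    """
    if not key_times or len(key_times) != len(key_frames):
        raise ValueError("need exactly one frame per key time")
    if len(key_times) == 1:
        return [list(key_frames[0])]

    start, end = key_times[0], key_times[-1]
    n_out = int(round((end - start) / frame_period)) + 1
    out = []
    for i in range(n_out):
        t = min(start + i * frame_period, end)
        # Key frame at or before t, clamped so a right neighbour exists.
        lo = min(max(bisect_right(key_times, t) - 1, 0), len(key_times) - 2)
        hi = lo + 1
        w = (t - key_times[lo]) / (key_times[hi] - key_times[lo])
        out.append([(1 - w) * a + w * b
                    for a, b in zip(key_frames[lo], key_frames[hi])])
    return out


# Three key frames at uneven spacing, resampled to a fixed period of 0.5.
dense = interpolate_frames(
    [0.0, 2.0, 3.0],
    [[0.0, 10.0], [4.0, 10.0], [5.0, 20.0]],
    0.5,
)
```

In this sketch the uniformly spaced frames in `dense` stand in for the fixed-rate input a vocoder would expect; a real system might use a learned or higher-order interpolator instead.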

3.

Invention application
TRAINING A VOICE MORPHING APPARATUS (Granted)

Publication No.: US20210193159A1

Publication date: 2021-06-24

Application No.: US16740378

Filing date: 2020-01-10

Applicant: SoundHound, Inc.

Inventor: Steve PEARSON

IPC: G10L21/013 , G10L25/18 , G10L17/00 , G10L25/51 , G10L25/30 , G10L21/02

Abstract: Systems and methods for training a voice morphing apparatus are described. The voice morphing apparatus is trained to morph input audio data to mask an identity of a speaker. Training is performed by evaluating an objective function that is a function of the input audio data and an output of the voice morphing apparatus. The objective function may have a first term that is based on speaker identification and a second term that is based on audio fidelity. By optimizing the objective function, parameters of the voice morphing apparatus may be adjusted so as to reduce a confidence of speaker identification and maintain an audio fidelity of the morphed audio data. The voice morphing apparatus, once trained, may be used as part of an automatic speech recognition system.
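
The two-term objective described in this abstract might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the form of either term or the optimizer, so the difference-of-terms objective, the grid search over candidates, and all `toy_*` stand-ins below are hypothetical and replace whatever speaker-identification and fidelity models the application actually uses.

```python
def objective(original, morphed, speaker_confidence, fidelity, weight=1.0):
    """Lower is better.

    First term: confidence that the morphed audio is still attributed to
    the original speaker (to be reduced).
    Second term: fidelity of the morphed audio (to be maintained).
    """
    return (speaker_confidence(original, morphed)
            - weight * fidelity(original, morphed))


def select_parameter(original, morph, candidates,
                     speaker_confidence, fidelity, weight=1.0):
    """Pick the candidate parameter with the lowest objective value.

    A plain search over candidates, assumed here for simplicity; it is a
    stand-in for whatever optimization the training procedure performs.
    """
    return min(
        candidates,
        key=lambda p: objective(original, morph(original, p),
                                speaker_confidence, fidelity, weight),
    )


# Hypothetical stand-ins: "audio" is a short list of pitch-like values.
def toy_morph(audio, shift):
    return [x + shift for x in audio]


def _mean_abs_diff(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def toy_speaker_confidence(original, morphed):
    # Confidence falls as the morphed audio moves away from the original.
    return 1.0 / (1.0 + _mean_abs_diff(original, morphed))


def toy_fidelity(original, morphed):
    # Fidelity also falls with distance, but more slowly.
    return 1.0 - 0.1 * _mean_abs_diff(original, morphed)


audio = [1.0, 2.0, 3.0]
best_shift = select_parameter(audio, toy_morph, [0, 1, 2, 4, 8],
                              toy_speaker_confidence, toy_fidelity)
```

With these toy terms, a moderate shift gives the lowest objective: small shifts leave the speaker identifiable, while large shifts cost too much fidelity. The `weight` argument controls that trade-off.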
