Patent search cpc:"G10L2021/0135" Page 1

1.

Granted invention patent
System and method for voice morphing in a data annotator tool (In force)

Publication number: US12086564B2

Publication date: 2024-09-10

Application number: US17539182

Filing date: 2021-11-30

Applicant: SoundHound, Inc.

Inventor: Dylan H. Ross

IPC: G10L15/18 , G06F40/56 , G06F40/58 , G10L15/06 , G10L19/125 , G10L19/26 , G10L21/013

CPC classification number: G06F40/56 , G06F40/58 , G10L15/06 , G10L15/18 , G10L19/125 , G10L19/265 , G10L21/013 , G10L2021/0135

Abstract: A system and method for masking an identity of a speaker of natural language speech, such as speech clips to be labeled by humans in a system generating voice transcriptions for training an automatic speech recognition model. The natural language speech is morphed prior to being presented to the human for labeling. In one embodiment, morphing comprises pitch shifting the speech randomly either up or down, then frequency shifting the speech, then pitch shifting the speech in a direction opposite the first pitch shift. Labeling the morphed speech comprises at least one or more of transcribing the morphed speech, identifying a gender of the speaker, identifying an accent of the speaker, and identifying a noise type of the morphed speech.
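The order of the three morphing steps in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: `pitch_shift` and `frequency_shift` are hypothetical caller-supplied DSP routines, the default shift amounts are placeholders, and using the same magnitude for both pitch shifts is an assumption (the abstract only says the second shift goes in the opposite direction).

```python
import random


def morph_speech(samples, pitch_shift, frequency_shift,
                 semitones=2.0, shift_hz=50.0, rng=None):
    """Apply the three morphing steps in the order the abstract gives.

    pitch_shift(samples, semitones) and frequency_shift(samples, hz) are
    hypothetical caller-supplied routines; the default amounts are
    placeholders, not values taken from the patent.
    """
    if rng is None:
        rng = random.Random()
    direction = rng.choice((1, -1))  # randomly up (+1) or down (-1)
    out = pitch_shift(samples, direction * semitones)
    out = frequency_shift(out, shift_hz)
    # Second pitch shift goes the opposite way; equal magnitude is assumed.
    return pitch_shift(out, -direction * semitones)
```

Passing the DSP routines in as arguments keeps the sketch independent of any particular audio library.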

2.

Granted invention patent
Systems, methods and computer program products for generating script elements and call to action components therefor (In force)

Publication number: US11978092B2

Publication date: 2024-05-07

Application number: US17518109

Filing date: 2021-11-03

Applicant: Spotify AB

Inventor: Lu Han , Rachel M. Bittner

IPC: G06Q30/0241 , G10L19/00 , G10L21/013 , G10L21/0232 , G06F3/16

CPC classification number: G06Q30/0276 , G10L19/00 , G10L21/013 , G10L21/0232 , G06F3/165 , G10L2021/0135

Abstract: A call to action processor receives an entity datapoint containing data related to an entity, a campaign objective datapoint containing data associated with a campaign objective, at least one definite script element based on the campaign objective, and entity metadata containing data associated with the entity. The call to action processor further generates at least one variable script element based on the entity metadata and presents to a device the at least one definite script element and the at least one variable script element.

3.

Published invention application
SYSTEM AND METHOD FOR CREATING TIMBRES (Under examination, published)

Publication number: US20240119954A1

Publication date: 2024-04-11

Application number: US18528244

Filing date: 2023-12-04

Applicant: Modulate, Inc.

Inventor: William Carter Huffman , Michael Pappas

IPC: G10L21/013 , G10L15/02 , G10L15/06 , G10L15/22 , G10L19/018

CPC classification number: G10L21/013 , G10L15/02 , G10L15/063 , G10L15/22 , G10L19/018 , G10L2015/025 , G10L2021/0135 , G10L25/30

Abstract: A method of building a new voice having a new timbre using a timbre vector space includes receiving timbre data filtered using a temporal receptive field. The timbre data is mapped in the timbre vector space. The timbre data is related to a plurality of different voices. Each of the plurality of different voices has respective timbre data in the timbre vector space. The method builds the new timbre using the timbre data of the plurality of different voices using a machine learning system.

4.

Granted invention patent
Devices and methods for a speech-based user interface (In force)

Publication number: US11798526B2

Publication date: 2023-10-24

Application number: US17653005

Filing date: 2022-03-01

Applicant: Google LLC

Inventor: Ioannis Agiomyrgiannakis , Fergus James Henderson

IPC: G10L13/10 , G10L13/033 , G06F3/16 , G10L21/013

CPC classification number: G10L13/033 , G06F3/167 , G10L13/10 , G10L2021/0135

Abstract: A device may identify a plurality of sources for outputs that the device is configured to provide. The plurality of sources may include at least one of a particular application in the device, an operating system of the device, a particular area within a display of the device, or a particular graphical user interface object. The device may also assign a set of distinct voices to respective sources of the plurality of sources. The device may also receive a request for speech output. The device may also select a particular source that is associated with the requested speech output. The device may also generate speech having particular voice characteristics of a particular voice assigned to the particular source.
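The assign-then-select flow in this abstract amounts to a lookup from output source to voice. The sketch below is illustrative only: the function names, the string identifiers for sources and voices, and the `default_voice` fallback are assumptions and do not come from the patent.

```python
def assign_voices(sources, voices):
    """Pair each output source with a voice no other source uses."""
    unique_voices = list(dict.fromkeys(voices))  # drop repeats, keep order
    if len(unique_voices) < len(sources):
        raise ValueError("need at least one distinct voice per source")
    return dict(zip(sources, unique_voices))


def voice_for_request(request_source, assignment, default_voice):
    """Return the voice assigned to the source behind a speech request.

    default_voice is a hypothetical fallback for unknown sources.
    """
    return assignment.get(request_source, default_voice)
```

For example, `assign_voices(["os", "mail_app"], ["voice_a", "voice_b"])` maps the operating system to one voice and the mail application to another, so a listener can tell the two sources apart by ear.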

5.

Published invention application
VOICE CONVERSION DEVICE, VOICE CONVERSION METHOD, PROGRAM, AND RECORDING MEDIUM (Under examination, published)

Publication number: US20230317090A1

Publication date: 2023-10-05

Application number: US18043105

Filing date: 2022-06-01

Applicant: DWANGO CO., LTD.

Inventor: Kazuyuki HIROSHIBA , Yuri ODAGIRI , Shinya KITAOKA

IPC: G10L21/013 , G10L21/04 , G10L15/02

CPC classification number: G10L21/013 , G10L21/04 , G10L15/02 , G10L2021/0135 , G10L2015/025

Abstract: A voice conversion apparatus includes: an input unit that inputs designation of a conversion destination voice; an extraction unit that analyzes a voice signal of a conversion source voice and extracts time series data including a phoneme and a pitch; an adjustment unit that matches a height of the pitch to a height of the designated conversion destination voice; and a generation unit that inputs the phoneme and the pitch to a deep learning model that learns voice data of many people and is capable of synthesizing a designated person's voice in time-series order, and generates a voice signal obtained by synthesizing the designated conversion destination voice.
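One plausible reading of the adjustment unit is a uniform rescaling of the source pitch contour so that its average height matches the destination voice. The sketch below assumes that reading; the abstract does not say how the match is computed. The geometric-mean criterion, the convention that 0 marks unvoiced frames, and the function name are all assumptions.

```python
import math


def match_pitch_height(f0_hz, target_mean_hz):
    """Scale a pitch contour so its geometric-mean pitch equals the target.

    f0_hz holds one pitch value per frame in Hz; 0 marks an unvoiced frame
    (an assumed convention) and is passed through unchanged.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        return list(f0_hz)
    log_mean = sum(math.log(f) for f in voiced) / len(voiced)
    ratio = target_mean_hz / math.exp(log_mean)
    return [f * ratio if f > 0 else 0.0 for f in f0_hz]
```

Scaling every voiced frame by the same ratio shifts the overall height while preserving the relative intonation of the source utterance.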

6.

Published invention application
NEURAL PITCH-SHIFTING AND TIME-STRETCHING (Under examination, published)

Publication number: US20230197093A1

Publication date: 2023-06-22

Application number: US17558580

Filing date: 2021-12-21

Applicant: Adobe Inc. , Northwestern University

Inventor: Maxwell Morrison , Juan Pablo Caceres Chomali , Zeyu Jin , Nicholas Bryan , Bryan A. Pardo

IPC: G10L21/013 , G10L15/02 , G10L15/18 , G10L25/90 , G10L25/30 , G10L19/028 , G10L19/032 , G10L21/04 , G10L25/24 , G10L15/06

CPC classification number: G10L21/013 , G10L15/02 , G10L15/1807 , G10L25/90 , G10L25/30 , G10L19/028 , G10L19/032 , G10L21/04 , G10L25/24 , G10L15/063 , G10L2021/0135

Abstract: Methods for modifying audio data include operations for accessing audio data having a first prosody, receiving a target prosody differing from the first prosody, and computing acoustic features representing samples. Computing respective acoustic features for a sample includes computing a pitch feature as a quantized pitch value of the sample by assigning a pitch value, of the target prosody or the audio data, to at least one of a set of pitch bins having equal widths in cents. Computing the respective acoustic features further includes computing a periodicity feature from the audio data. The respective acoustic features for the sample include the pitch feature, the periodicity feature, and other acoustic features. A neural vocoder is applied to the acoustic features to pitch-shift and time-stretch the audio data from the first prosody toward the target prosody.
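Quantizing pitch into bins of equal width in cents, as this abstract describes, can be illustrated with the standard conversion of 1200 cents per octave. The sketch below is an assumption-laden example: the lowest frequency `fmin_hz`, the bin width, the number of bins, and the clamping of out-of-range pitches are placeholder choices, not values from the application.

```python
import math


def quantize_pitch(freq_hz, fmin_hz=50.0, bin_width_cents=25.0, n_bins=256):
    """Return the index of the equal-width (in cents) bin containing freq_hz."""
    if freq_hz <= 0 or fmin_hz <= 0:
        raise ValueError("frequencies must be positive")
    # 1200 cents per octave, measured upward from the lowest bin edge.
    cents_above_fmin = 1200.0 * math.log2(freq_hz / fmin_hz)
    index = math.floor(cents_above_fmin / bin_width_cents)
    return min(max(index, 0), n_bins - 1)  # clamp out-of-range pitches


def bin_center_hz(index, fmin_hz=50.0, bin_width_cents=25.0):
    """Frequency at the center of a bin, converting cents back to Hz."""
    return fmin_hz * 2.0 ** ((index + 0.5) * bin_width_cents / 1200.0)
```

Because the bins are uniform in cents rather than in Hz, each one spans the same musical interval, so low and high pitches are resolved equally finely in perceptual terms.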

7.

Invention application
Devices and Methods for a Speech-Based User Interface (Under examination, published)

Publication number: US20180144737A1

Publication date: 2018-05-24

Application number: US15874051

Filing date: 2018-01-18

Applicant: Google LLC

Inventor: Ioannis Agiomyrgiannakis , Fergus James Henderson

IPC: G10L13/033 , G06F3/16 , G10L21/013

CPC classification number: G10L13/033 , G06F3/167 , G10L13/10 , G10L2021/0135

Abstract: A device may identify a plurality of sources for outputs that the device is configured to provide. The plurality of sources may include at least one of a particular application in the device, an operating system of the device, a particular area within a display of the device, or a particular graphical user interface object. The device may also assign a set of distinct voices to respective sources of the plurality of sources. The device may also receive a request for speech output. The device may also select a particular source that is associated with the requested speech output. The device may also generate speech having particular voice characteristics of a particular voice assigned to the particular source.

8.

Granted invention patent
Accent correction in speech recognition systems (In force)

Publication number: US09870769B2

Publication date: 2018-01-16

Application number: US14955311

Filing date: 2015-12-01

Applicant: International Business Machines Corporation

Inventor: Su Liu , Yi Liu , Cheng Xu , Shi Lei Zhang

IPC: G10L15/06 , G10L15/00 , G10L15/26 , G10L15/187 , G10L15/02 , G10L15/07

CPC classification number: G10L15/187 , G10L15/02 , G10L15/063 , G10L15/075 , G10L15/26 , G10L21/003 , G10L2015/022 , G10L2015/0635 , G10L2021/0135

Abstract: A method comprising receiving an audio input signal comprising speech, determining an accent class corresponding to the speech, identifying an accented phone pattern within the speech, replacing the accented phone pattern with an unaccented phone pattern, and generating an unaccented output signal from the unaccented phone pattern.
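The replacement step in this abstract can be pictured as pattern substitution over a phone sequence. The sketch below covers only that step, under assumptions: phones are plain strings, the rule table keyed by accent class is hypothetical, and rules are tried in list order at each position. Determining the accent class and generating the output signal are outside the sketch.

```python
def replace_phone_patterns(phones, rules):
    """Replace each accented phone pattern with its unaccented counterpart.

    rules is a list of (pattern, replacement) tuples of phone symbols;
    the first rule that matches at a position wins.
    """
    out = []
    i = 0
    while i < len(phones):
        for pattern, replacement in rules:
            n = len(pattern)
            if n and tuple(phones[i:i + n]) == tuple(pattern):
                out.extend(replacement)
                i += n
                break
        else:
            out.append(phones[i])  # no rule matched here; keep the phone
            i += 1
    return out


def correct_accent(phones, accent_class, rules_by_accent):
    """Apply the rule set for the detected accent class, if one exists."""
    return replace_phone_patterns(phones, rules_by_accent.get(accent_class, []))
```

With a made-up table such as `{"demo": [(("Z",), ("DH",))]}`, the sequence `["Z", "IH", "S"]` becomes `["DH", "IH", "S"]` for accent class `"demo"` and is returned unchanged for any other class.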

9.

Invention application
COMMUNICATION APPARATUS MOUNTED WITH SPEECH SPEED CONVERSION DEVICE (Under examination, published)

Publication number: US20170345444A1

Publication date: 2017-11-30

Application number: US15496900

Filing date: 2017-04-25

Applicant: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.

Inventor: Toshimichi Tokuda

IPC: G10L21/043 , G10L25/78 , G10L21/038 , G10L21/057 , G10L21/0364

CPC classification number: G10L21/043 , G10L21/0364 , G10L21/038 , G10L21/057 , G10L25/78 , G10L25/90 , G10L2021/0135 , G10L2025/783 , H04M1/6016 , H04M1/642 , H04M1/6505

Abstract: In a communication apparatus, an encoder compresses telephone call voice which is transmitted from another communication apparatus. A voice accumulator preserves the telephone call voice, which is compressed by the encoder, as a message. A decoder expands the telephone call voice which is preserved in the voice accumulator. A signal memory temporarily maintains the telephone call voice which is expanded by the decoder. A speech speed convertor performs speech speed conversion on the telephone call voice, which is read from the signal memory, and outputs resulting voice from a speaker. A memory monitor temporarily stops the decoder from expanding the telephone call voice when it determines that the idle capacity of the signal memory is approaching a predetermined lower limit value.
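The memory monitor's role resembles backpressure on a bounded buffer: decoding pauses while the signal memory is nearly full and resumes once the speech speed convertor has read frames out. The sketch below is a simplified model, not the apparatus itself. The class and function names are hypothetical, frames are opaque values, and "approaches the lower limit" is interpreted as idle capacity at or below the limit.

```python
from collections import deque


class SignalMemory:
    """Bounded buffer of decoded frames awaiting speech speed conversion."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.frames = deque()

    def idle_capacity(self):
        return self.capacity - len(self.frames)


def decoder_tick(compressed, memory, lower_limit, decode):
    """Decode one frame unless idle capacity is at or below lower_limit.

    Returns True if a frame was decoded, and False if decoding was paused
    or there was nothing left to decode.
    """
    if not compressed or memory.idle_capacity() <= lower_limit:
        return False
    memory.frames.append(decode(compressed.popleft()))
    return True
```

Slowed-down playback drains the signal memory more slowly than the decoder fills it, which is why the decoder has to be throttled rather than allowed to run freely.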

10.

Granted invention patent
Method and apparatus for using a vocal sample to customize text to speech applications (In force)

Publication number: US09830903B2

Publication date: 2017-11-28

Application number: US14757028

Filing date: 2015-11-10

Applicant: Paul Wendell Mason

Inventor: Paul Wendell Mason

IPC: G10L13/00 , G10L13/027 , G10L13/033 , G10L13/04 , G10L21/007 , G10L25/48

CPC classification number: G10L13/027 , G10L13/0335 , G10L13/043 , G10L21/007 , G10L25/48 , G10L2021/0135

Abstract: Apparatus and methods consistent with the present invention measure one or more of the characteristics of a voice recording and use such measurements to create a synthetic voice that approximates the recorded voice and uses such created synthetic voice to verbalize the content of an electronically conveyed written message such as an SMS text message. The vocal characteristics measured may include frequency, timbre, intensity, rhythm, and rate of speech as well as others.
