Patent search ap:("GOOGLE LLC") AND inv:"Rakesh Iyer" Page 1

1.

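Granted invention patent
Predicting parametric vocoder parameters from prosodic features (status: active)

Publication (announcement) number: US11232780B1

Publication (announcement) date: 2022-01-25

Application number: US17033783

Application date: 2020-09-26

Applicant: Google LLC

Inventor: Rakesh Iyer, Vincent Wan

IPC: G10L13/027, G10L13/10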

Abstract: A method for predicting parametric vocoder parameters includes receiving a text utterance having one or more words, each word having one or more syllables, and each syllable having one or more phonemes. The method also includes receiving, as input to a vocoder model, prosodic features that represent an intended prosody for the text utterance and a linguistic specification. The prosodic features include a duration, pitch contour, and energy contour for the text utterance, while the linguistic specification includes sentence-level linguistic features, word-level linguistic features for each word, syllable-level linguistic features for each syllable, and phoneme-level linguistic features for each phoneme. The method also includes predicting vocoder parameters based on the prosodic features and the linguistic specification. The method also includes providing the predicted vocoder parameters and the prosodic features to a parametric vocoder configured to generate a synthesized speech representation of the text utterance having the intended prosody.
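
Illustrative note (not part of the patent record): the abstract describes a hierarchical linguistic specification (sentence, word, syllable, phoneme) that is fed to a vocoder model together with prosodic features. The Python sketch below shows one possible way to hold those inputs. Every class, field, and feature name here is hypothetical, and flattening to one row per phoneme is an assumption made for illustration only; the abstract does not say how the features are encoded, and no model is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Features = Dict[str, float]  # hypothetical encoding: feature name -> numeric value


@dataclass
class Phoneme:
    symbol: str
    features: Features = field(default_factory=dict)


@dataclass
class Syllable:
    phonemes: List[Phoneme]
    features: Features = field(default_factory=dict)


@dataclass
class Word:
    text: str
    syllables: List[Syllable]
    features: Features = field(default_factory=dict)


@dataclass
class Utterance:
    words: List[Word]
    features: Features = field(default_factory=dict)  # sentence-level features


@dataclass
class Prosody:
    duration_s: float
    pitch_contour_hz: List[float]
    energy_contour: List[float]


def linguistic_rows(utt: Utterance) -> List[Features]:
    """Return one flat feature dict per phoneme, with keys prefixed by level."""
    rows: List[Features] = []
    for word in utt.words:
        for syl in word.syllables:
            for ph in syl.phonemes:
                row: Features = {}
                levels = (
                    ("sentence", utt.features),
                    ("word", word.features),
                    ("syllable", syl.features),
                    ("phoneme", ph.features),
                )
                for level, feats in levels:
                    for name, value in feats.items():
                        row[f"{level}.{name}"] = value
                rows.append(row)
    return rows


def vocoder_model_inputs(utt: Utterance, prosody: Prosody) -> dict:
    """Bundle the two inputs the abstract names; a real model would consume these."""
    return {"prosody": prosody, "linguistic": linguistic_rows(utt)}


# Toy utterance: one word, two syllables, four phonemes (ARPAbet-style symbols).
utt = Utterance(
    words=[Word("hello", [
        Syllable([Phoneme("HH"), Phoneme("AH", {"is_vowel": 1.0})], {"stressed": 0.0}),
        Syllable([Phoneme("L"), Phoneme("OW", {"is_vowel": 1.0})], {"stressed": 1.0}),
    ], {"position": 0.0})],
    features={"is_question": 0.0},
)
prosody = Prosody(duration_s=0.5, pitch_contour_hz=[120.0, 140.0], energy_contour=[0.6, 0.8])
inputs = vocoder_model_inputs(utt, prosody)
rows = inputs["linguistic"]
```

Each row repeats the sentence-, word-, and syllable-level features of its parent units, so a per-phoneme consumer sees the full context in one place.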

2.

Invention application
Systems and Methods for Extracting Information from a Physical Document (status: active)

Publication (announcement) number: US20210406451A1

Publication (announcement) date: 2021-12-30

Application number: US17291647

Application date: 2019-01-28

Applicant: Google LLC

Inventor: Rakesh Iyer, Lisha Ruan

IPC: G06F40/169, G06K9/46, G06K9/00, G06K9/20

Abstract: Systems and methods for extracting information from documents are provided. In one example embodiment, a computer-implemented method includes obtaining one or more units of text from an image of a document. The method includes determining one or more annotated values from the one or more units of text and determining a set of candidate labels for each annotated value. The method determines each set of candidate labels by performing a search for the candidate labels based at least in part on a language associated with the document and a location of each annotated value. The method includes determining a canonical label for each annotated value based at least in part on the associated candidate labels, and mapping at least one annotated value to an action that is presented to a user based at least in part on the canonical label associated with the annotated value.

3.

Granted invention patent
Predicting parametric vocoder parameters from prosodic features (status: active)

Publication (announcement) number: US12125469B2

Publication (announcement) date: 2024-10-22

Application number: US18488735

Application date: 2023-10-17

Applicant: Google LLC

Inventor: Rakesh Iyer, Vincent Wan

IPC: G10L13/10, G10L13/027

CPC classification number: G10L13/027, G10L13/10

Abstract: A method for predicting parametric vocoder parameters includes receiving a text utterance having one or more words, each word having one or more syllables, and each syllable having one or more phonemes. The method also includes receiving, as input to a vocoder model, prosodic features that represent an intended prosody for the text utterance and a linguistic specification. The prosodic features include a duration, pitch contour, and energy contour for the text utterance, while the linguistic specification includes sentence-level linguistic features, word-level linguistic features for each word, syllable-level linguistic features for each syllable, and phoneme-level linguistic features for each phoneme. The method also includes predicting vocoder parameters based on the prosodic features and the linguistic specification. The method also includes providing the predicted vocoder parameters and the prosodic features to a parametric vocoder configured to generate a synthesized speech representation of the text utterance having the intended prosody.

4.

Granted invention patent
Systems and methods for extracting information from a physical document (status: active)

Publication (announcement) number: US12033412B2

Publication (announcement) date: 2024-07-09

Application number: US17291647

Application date: 2019-01-28

Applicant: Google LLC

Inventor: Rakesh Iyer, Lisha Ruan

IPC: G06V30/40, G06F40/169, G06V30/10

CPC classification number: G06V30/40, G06F40/169, G06V30/10

Abstract: Systems and methods for extracting information from documents are provided. In one example embodiment, a computer-implemented method includes obtaining one or more units of text from an image of a document. The method includes determining one or more annotated values from the one or more units of text and determining a set of candidate labels for each annotated value. The method determines each set of candidate labels by performing a search for the candidate labels based at least in part on a language associated with the document and a location of each annotated value. The method includes determining a canonical label for each annotated value based at least in part on the associated candidate labels, and mapping at least one annotated value to an action that is presented to a user based at least in part on the canonical label associated with the annotated value.

5.

Invention application
Systems and Methods for Extracting Information from a Physical Document (status: active)

Publication (announcement) number: US20240404308A1

Publication (announcement) date: 2024-12-05

Application number: US18671218

Application date: 2024-05-22

Applicant: Google LLC

Inventor: Rakesh Iyer, Lisha Ruan

IPC: G06V30/40, G06F40/169, G06V30/10

Abstract: Systems and methods for extracting information from documents are provided. In one example embodiment, a computer-implemented method includes obtaining one or more units of text from an image of a document. The method includes determining one or more annotated values from the one or more units of text and determining a set of candidate labels for each annotated value. The method determines each set of candidate labels by performing a search for the candidate labels based at least in part on a language associated with the document and a location of each annotated value. The method includes determining a canonical label for each annotated value based at least in part on the associated candidate labels, and mapping at least one annotated value to an action that is presented to a user based at least in part on the canonical label associated with the annotated value.

6.

Invention publication
AUTOMATIC ADAPTATION OF THE SYNTHESIZED SPEECH OUTPUT OF A TRANSLATION APPLICATION (status: under examination, published)

Publication (announcement) number: US20240331681A1

Publication (announcement) date: 2024-10-03

Application number: US18128107

Application date: 2023-03-29

Applicant: GOOGLE LLC

Inventor: Rakesh Iyer, Jeffrey Robert Pitman, Pendar Yousefi, Te I, Tiruvilwamalai Raman

IPC: G10L13/047, G06F40/58, G10L13/033, G10L13/08, G10L15/00, G10L15/16, G10L15/22, G10L25/90

CPC classification number: G10L13/047, G06F40/58, G10L13/0335, G10L13/08, G10L15/005, G10L15/16, G10L15/22, G10L25/90

Abstract: A computer generated voice can automatically be adapted to be similar to a user's voice. Various implementations include processing audio data capturing a first language spoken utterance to identify one or more pitch characteristics. For example, the one or more pitch characteristics can include an estimated frequency range of the given user's voice. Additionally or alternatively, the system can process the audio data capturing the first language spoken utterance and a set of candidate computer generated voices using a computer generated voice selection model to select a candidate computer generated voice. Various implementations can include automatically modifying the selected candidate computer generated voice based on the one or more pitch characteristics to change the frequency range of the modified computer generated voice based on the user's voice.
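
Illustrative note (not part of the patent record): the abstract mentions changing the frequency range of a selected computer generated voice based on the estimated frequency range of the user's voice. The Python sketch below shows one simple way such a shift could be computed, under the assumption that each range is summarized by its geometric center and that the voice is moved by a uniform shift in semitones. The function names and this centering rule are hypothetical; the abstract does not specify the modification method, and the voice selection model is not sketched here.

```python
import math
from typing import List, Tuple

Range = Tuple[float, float]  # (low_hz, high_hz), both assumed positive


def range_center_hz(freq_range: Range) -> float:
    """Geometric mean of the range, since pitch is perceived on a log scale."""
    low_hz, high_hz = freq_range
    return math.sqrt(low_hz * high_hz)


def semitone_shift(voice_range: Range, user_range: Range) -> float:
    """Semitones to add to the voice so its range center matches the user's."""
    ratio = range_center_hz(user_range) / range_center_hz(voice_range)
    return 12.0 * math.log2(ratio)


def shift_contour(contour_hz: List[float], semitones: float) -> List[float]:
    """Apply a uniform pitch shift to every value of a pitch contour."""
    factor = 2.0 ** (semitones / 12.0)
    return [f * factor for f in contour_hz]


# Toy example: the user's range sits one octave above the candidate voice.
shift = semitone_shift(voice_range=(100.0, 200.0), user_range=(200.0, 400.0))
adapted = shift_contour([100.0, 150.0, 200.0], shift)
```

A uniform shift preserves the relative shape of the contour; matching the width of the range as well would need an additional scaling step that is not shown.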

7.

Invention application
GENERATING SYNTHESIZED SPEECH INPUT (status: active)

Publication (announcement) number: US20230097338A1

Publication (announcement) date: 2023-03-30

Application number: US17533401

Application date: 2021-11-23

Applicant: GOOGLE LLC

Inventor: Nnamdi Kalu, Fernando Fernandes, Uri First, Erwin Jansen, Rakesh Iyer, Lingfeng Yang

IPC: G10L13/08, G10L13/02, G10L15/26, G06F40/279

Abstract: Systems and methods for synthesizing speech based on received text and one or more emulated speech parameters. Text is received with one or more emulated speech parameters that indicate one or more features for the synthesized speech. Synthesized speech audio is generated based on the received parameters. The synthesized speech audio data is provided to an emulated microphone component that provides the synthesized audio to an automatic speech recognizer. The automatic speech recognizer utilizes one or more speech recognition models to generate converted text based on the synthesized speech audio data.

8.

Invention publication
Predicting Parametric Vocoder Parameters From Prosodic Features (status: under examination, published)

Publication (announcement) number: US20240046915A1

Publication (announcement) date: 2024-02-08

Application number: US18488735

Application date: 2023-10-17

Applicant: Google LLC

Inventor: Rakesh Iyer, Vincent Wan

IPC: G10L13/027, G10L13/10

CPC classification number: G10L13/027, G10L13/10

Abstract: A method for predicting parametric vocoder parameters includes receiving a text utterance having one or more words, each word having one or more syllables, and each syllable having one or more phonemes. The method also includes receiving, as input to a vocoder model, prosodic features that represent an intended prosody for the text utterance and a linguistic specification. The prosodic features include a duration, pitch contour, and energy contour for the text utterance, while the linguistic specification includes sentence-level linguistic features, word-level linguistic features for each word, syllable-level linguistic features for each syllable, and phoneme-level linguistic features for each phoneme. The method also includes predicting vocoder parameters based on the prosodic features and the linguistic specification. The method also includes providing the predicted vocoder parameters and the prosodic features to a parametric vocoder configured to generate a synthesized speech representation of the text utterance having the intended prosody.

9.

Granted invention patent
Predicting parametric vocoder parameters from prosodic features (status: active)

Publication (announcement) number: US11830474B2

Publication (announcement) date: 2023-11-28

Application number: US17647246

Application date: 2022-01-06

Applicant: Google LLC

Inventor: Rakesh Iyer, Vincent Wan

IPC: G10L13/10, G10L13/027

CPC classification number: G10L13/027, G10L13/10

Abstract: A method for predicting parametric vocoder parameters includes receiving a text utterance having one or more words, each word having one or more syllables, and each syllable having one or more phonemes. The method also includes receiving, as input to a vocoder model, prosodic features that represent an intended prosody for the text utterance and a linguistic specification. The prosodic features include a duration, pitch contour, and energy contour for the text utterance, while the linguistic specification includes sentence-level linguistic features, word-level linguistic features for each word, syllable-level linguistic features for each syllable, and phoneme-level linguistic features for each phoneme. The method also includes predicting vocoder parameters based on the prosodic features and the linguistic specification. The method also includes providing the predicted vocoder parameters and the prosodic features to a parametric vocoder configured to generate a synthesized speech representation of the text utterance having the intended prosody.

10.

Invention application
Predicting Parametric Vocoder Parameters From Prosodic Features (status: active)

Publication (announcement) number: US20220130371A1

Publication (announcement) date: 2022-04-28

Application number: US17647246

Application date: 2022-01-06

Applicant: Google LLC

Inventor: Rakesh Iyer, Vincent Wan

IPC: G10L13/027, G10L13/10

Abstract: A method for predicting parametric vocoder parameters includes receiving a text utterance having one or more words, each word having one or more syllables, and each syllable having one or more phonemes. The method also includes receiving, as input to a vocoder model, prosodic features that represent an intended prosody for the text utterance and a linguistic specification. The prosodic features include a duration, pitch contour, and energy contour for the text utterance, while the linguistic specification includes sentence-level linguistic features, word-level linguistic features for each word, syllable-level linguistic features for each syllable, and phoneme-level linguistic features for each phoneme. The method also includes predicting vocoder parameters based on the prosodic features and the linguistic specification. The method also includes providing the predicted vocoder parameters and the prosodic features to a parametric vocoder configured to generate a synthesized speech representation of the text utterance having the intended prosody.
