Patent search ap:("Amazon Technologies Inc.") AND inv:"Sundararajan Srinivasan"

1.

Granted invention patent
Speech processing using embedding data (Active)

Publication (announcement) No.: US11282495B2

Publication (announcement) date: 2022-03-22

Application No.: US16712567

Filing date: 2019-12-12

Applicant: Amazon Technologies, Inc.

Inventor: Hongda Mao, George Yu-Chien Lin, Sundararajan Srinivasan, Chu-Cheng Hsieh

IPC: G10L13/027, G10L13/00, G10L17/04, G10L17/18

Abstract: A first neural network model of a user device processes audio data to extract audio embeddings that represent vocal characteristics of a user of an utterance represented in the audio data. The audio embeddings may then be hashed to remove characteristics specific to the user while still maintaining a unique set of characteristics. The hashed embeddings may be sent to a remote system, which may use them to identify the user.
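
The abstract does not say which hash function is applied to the embeddings. As a minimal sketch only, the following assumes a random-hyperplane (sign) hash, a common locality-sensitive scheme in which similar embeddings tend to produce similar bit strings. The function names, the seed, and the bit width are all hypothetical and are not taken from the patent.

```python
import random

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumed scheme, not from the patent)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_embedding(embedding, hyperplanes):
    """Map an embedding to an integer whose bits are the signs of its projections."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(e * p for e, p in zip(embedding, plane))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits

def hamming_distance(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

planes = make_hyperplanes(dim=4, n_bits=16)
embedding = [0.2, -0.1, 0.7, 0.4]           # stand-in for a model-produced embedding
hashed = hash_embedding(embedding, planes)  # value a device could send instead of the raw vector
```

Under this assumed scheme, a remote system would compare hashes by Hamming distance rather than by inspecting the raw embedding.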

2.

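Invention patent application
VOICE PROFILE UPDATING (Active)

Publication (announcement) No.: US20210304774A1

Publication (announcement) date: 2021-09-30

Application No.: US17228950

Filing date: 2021-04-13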

Applicant: Amazon Technologies, Inc.

Inventor: Sundararajan Srinivasan, Arindam Mandal, Krishna Subramanian, Spyridon Matsoukas, Aparna Khare, Rohit Prasad

IPC: G10L17/04, G06F3/16, G10L17/00, G10L15/06

Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.
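
The clustering and update steps described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: stored audio is assumed to be already reduced to embedding vectors, clustering is a simple greedy pass using cosine similarity, and the updated profile is assumed to be the mean of the original profile and the matching cluster. All function names and thresholds are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cluster_embeddings(embeddings, threshold=0.9):
    """Greedy clustering: join the first cluster whose centroid is similar enough."""
    clusters = []
    for emb in embeddings:
        for cluster in clusters:
            if cosine(emb, mean_vector(cluster)) >= threshold:
                cluster.append(emb)
                break
        else:
            clusters.append([emb])
    return clusters

def update_profile(profile, clusters, match_threshold=0.9):
    """Return an updated profile if some cluster closely matches it, else the original."""
    for cluster in clusters:
        if cosine(profile, mean_vector(cluster)) >= match_threshold:
            # Assumed update rule: average the old profile with the cluster members.
            return mean_vector([profile] + cluster)
    return profile

stored = [[1.0, 0.0], [0.98, 0.05], [0.0, 1.0], [0.05, 0.99]]  # recalled from storage
clusters = cluster_embeddings(stored)
updated = update_profile([0.95, 0.1], clusters)
```

A production system would likely use a more robust clustering algorithm and a weighted update; the greedy pass here only keeps the example short.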

3.

Granted invention patent
Speech based user recognition (Active)

Publication (announcement) No.: US11893999B1

Publication (announcement) date: 2024-02-06

Application No.: US16055755

Filing date: 2018-08-06

Applicant: Amazon Technologies, Inc.

Inventor: Sai Sailesh Kopuri, John Moore, Sundararajan Srinivasan, Aparna Khare, Arindam Mandal, Spyridon Matsoukas, Rohit Prasad

IPC: G10L17/22, G10L17/04, G10L17/10, G06F40/20

CPC classification number: G10L17/22, G06F40/20, G10L17/04, G10L17/10

Abstract: Techniques for enrolling a user in a system's user recognition functionality without requiring the user speak particular speech are described. The system may determine characteristics unique to a user input. The system may generate an implicit voice profile from user inputs having similar characteristics. After an implicit voice profile is generated, the system may receive a user input having speech characteristics similar to that of the implicit voice profile. The system may ask the user if the user wants the system to associate the implicit voice profile with a particular user identifier. If the user responds affirmatively, the system may request an identifier of a user profile (e.g., a user name). In response to receiving the user's name, the system may identify a user profile associated with the name and associate the implicit voice profile with the user profile, thereby converting the implicit voice profile into an explicit voice profile.
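
The conversion from an implicit to an explicit voice profile described above can be modeled as a small state change. The sketch below is only an illustration: the `VoiceProfile` class, the `enroll` function, and the name-to-identifier dictionary are hypothetical stand-ins, and the abstract does not specify any data model.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass(frozen=True)
class VoiceProfile:
    profile_id: str
    user_id: Optional[str] = None  # None means the profile is still implicit

    @property
    def is_explicit(self) -> bool:
        return self.user_id is not None

def enroll(profile: VoiceProfile, user_confirmed: bool, user_name: str,
           user_directory: Dict[str, str]) -> VoiceProfile:
    """Associate an implicit profile with a user profile if the user agrees."""
    if not user_confirmed:
        return profile                      # user declined; profile stays implicit
    user_id = user_directory.get(user_name)
    if user_id is None:
        return profile                      # no user profile found for that name
    return VoiceProfile(profile.profile_id, user_id)

directory = {"alice": "user-001"}           # hypothetical name -> user profile id
implicit = VoiceProfile("vp-42")
explicit = enroll(implicit, user_confirmed=True, user_name="alice",
                  user_directory=directory)
```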

4.

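Invention patent application
SPEECH PROCESSING (Active)

Publication (announcement) No.: US20210183358A1

Publication (announcement) date: 2021-06-17

Application No.: US16712567

Filing date: 2019-12-12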

Applicant: Amazon Technologies, Inc.

Inventor: Hongda Mao, George Yu-Chien Lin, Sundararajan Srinivasan, Chu-Cheng Hsieh

IPC: G10L13/027, G10L13/04, G10L17/04, G10L17/18

Abstract: A first neural network model of a user device processes audio data to extract audio embeddings that represent vocal characteristics of a user of an utterance represented in the audio data. The audio embeddings may then be hashed to remove characteristics specific to the user while still maintaining a unique set of characteristics. The hashed embeddings may be sent to a remote system, which may use them to identify the user.

5.

Granted invention patent
Voice profile updating (Active)

Publication (announcement) No.: US11004454B1

Publication (announcement) date: 2021-05-11

Application No.: US16182021

Filing date: 2018-11-06

Applicant: Amazon Technologies, Inc.

Inventor: Sundararajan Srinivasan, Arindam Mandal, Krishna Subramanian, Spyridon Matsoukas, Aparna Khare, Rohit Prasad

IPC: G10L17/04, G06F3/16, G10L17/00, G10L15/06

Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.

6.

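Invention patent application
GUIDING TRANSCRIPT GENERATION USING DETECTED SECTION TYPES AS PART OF AUTOMATIC SPEECH RECOGNITION (Active)

Publication (announcement) No.: US20250029612A1

Publication (announcement) date: 2025-01-23

Application No.: US18356117

Filing date: 2023-07-20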

Applicant: Amazon Technologies, Inc.

Inventor: Lei Xu, Aparna Elangovan, Rohit Paturi, Sundararajan Srinivasan, Sravan Babu Bodapati, Katrin Kirchoff, Sarthak Handa

IPC: G10L15/26, G06F40/20

Abstract: Transcript generation as part of automatic speech recognition may be guided using section types. Audio data is received for transcription. An initial transcript of the audio data may be generated and evaluated to determine a section type for the audio data. The section type may then be used to focus generation of a second version of the transcript on one speaker over another speaker.
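
The two-pass flow in this abstract can be illustrated with a toy example. Everything below is an assumption made for illustration: the section types ("interview", "presentation"), the question-counting rule, the speaker labels, and the choice to "focus" by keeping only one speaker's segments. A real system would presumably re-run recognition on the audio rather than filter text.

```python
def detect_section_type(initial_transcript):
    """Toy rule: if at least half the segments are questions, call it an interview."""
    questions = sum(1 for _, text in initial_transcript
                    if text.strip().endswith("?"))
    return "interview" if questions * 2 >= len(initial_transcript) else "presentation"

# Hypothetical mapping from section type to the speaker the second pass favors.
FOCUS_SPEAKER = {"interview": "speaker_2", "presentation": "speaker_1"}

def second_pass(initial_transcript, section_type):
    """Stand-in for a second version of the transcript focused on one speaker."""
    focus = FOCUS_SPEAKER[section_type]
    return [(spk, text) for spk, text in initial_transcript if spk == focus]

initial = [
    ("speaker_1", "When did the outage start?"),
    ("speaker_2", "Around nine in the morning."),
    ("speaker_1", "Was anything deployed before that?"),
    ("speaker_2", "Yes, a configuration change."),
]
section = detect_section_type(initial)
focused = second_pass(initial, section)
```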

7.

Granted invention patent
Voice profile updating (Active)

Publication (announcement) No.: US11200884B1

Publication (announcement) date: 2021-12-14

Application No.: US16181925

Filing date: 2018-11-06

Applicant: Amazon Technologies, Inc.

Inventor: Sundararajan Srinivasan, Arindam Mandal, Krishna Subramanian, Spyridon Matsoukas, Aparna Khare, Rohit Prasad

IPC: G10L15/06, G06F3/16, G10L15/18, G10L15/22

Abstract: Techniques for labeling user inputs for updating user recognition voice profiles are described. A system may leverage various signals, generated during or after processing of a user input, to retroactively determine which user spoke the user input. For example, after the system receives the user input, the user may provide the system with non-spoken user verification information. Based on such user verification information, the system may label the previously spoken user input as originating from the particular user. The system may also or alternatively use system usage history to retroactively label user inputs.
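
One way to picture the retroactive labeling described above is a pass over recently stored, unlabeled inputs once a verification event arrives. The sketch assumes a per-device time window and a simple record type; `UserInput`, `label_retroactively`, and the 60-second window are hypothetical choices, not details from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class UserInput:
    device_id: str
    timestamp: float               # seconds; when the spoken input was received
    user_id: Optional[str] = None  # None until the speaker is known

def label_retroactively(inputs: List[UserInput], device_id: str,
                        verified_user: str, verified_at: float,
                        window_s: float = 60.0) -> int:
    """Label unlabeled inputs from the same device that precede the verification
    event by at most window_s seconds. Returns the number of inputs labeled."""
    labeled = 0
    for item in inputs:
        same_device = item.device_id == device_id
        recent = 0.0 <= verified_at - item.timestamp <= window_s
        if same_device and recent and item.user_id is None:
            item.user_id = verified_user
            labeled += 1
    return labeled

history = [
    UserInput("kitchen", 100.0),
    UserInput("kitchen", 10.0),    # too old to attribute
    UserInput("bedroom", 110.0),   # different device
]
# A non-spoken verification (for example, a PIN entry) arrives at t = 120 s.
count = label_retroactively(history, "kitchen", "user-001", verified_at=120.0)
```

The labeled inputs could then feed a voice profile update; inputs outside the window or from other devices are left untouched.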
