-
Publication No.: US11282495B2
Publication Date: 2022-03-22
Application No.: US16712567
Application Date: 2019-12-12
Applicant: Amazon Technologies, Inc.
Inventor: Hongda Mao, George Yu-Chien Lin, Sundararajan Srinivasan, Chu-Cheng Hsieh
IPC: G10L13/027, G10L13/00, G10L17/04, G10L17/18
Abstract: A first neural network model of a user device processes audio data to extract audio embeddings that represent vocal characteristics of a user of an utterance represented in the audio data. The audio embeddings may then be hashed to remove characteristics specific to the user while still maintaining a unique set of characteristics. The hashed embeddings may be sent to a remote system, which may use them to identify the user.
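The abstract does not say which hash is used. As a rough illustration only, the sketch below assumes a sign-of-random-projection hash, a common locality-sensitive scheme in which similar embeddings receive similar bit strings while the raw embedding values are not transmitted. The names `make_planes`, `hash_embedding`, and `hamming_similarity` are hypothetical and do not come from the patent.

```python
import random

def make_planes(dim, num_bits, seed=0):
    """Draw num_bits random hyperplanes (assumed shared by device and remote system)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def hash_embedding(embedding, planes):
    """One bit per hyperplane: 1 if the embedding lies on its non-negative side."""
    return [
        1 if sum(p * x for p, x in zip(plane, embedding)) >= 0 else 0
        for plane in planes
    ]

def hamming_similarity(a, b):
    """Fraction of matching bits between two equal-length hashes."""
    return sum(x == y for x, y in zip(a, b)) / len(a)
```

Under this assumption, a remote system could compare hashes by Hamming similarity rather than comparing the original embeddings.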
-
Publication No.: US20210304774A1
Publication Date: 2021-09-30
Application No.: US17228950
Application Date: 2021-04-13
Applicant: Amazon Technologies, Inc.
Inventor: Sundararajan Srinivasan, Arindam Mandal, Krishna Subramanian, Spyridon Matsoukas, Aparna Khare, Rohit Prasad
Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.
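The flow described in this abstract (cluster stored audio, match a cluster to an existing profile, then merge) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: audio is assumed to be already reduced to non-zero embedding vectors, clustering is a simple greedy cosine-similarity pass, and the update is a weighted average. The function names and the `threshold`, `match_threshold`, and `weight` parameters are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def greedy_clusters(vectors, threshold=0.9):
    """Put each vector in the first cluster whose centroid is similar enough."""
    clusters = []
    for v in vectors:
        for cluster in clusters:
            if cosine(v, centroid(cluster)) >= threshold:
                cluster.append(v)
                break
        else:
            clusters.append([v])  # no similar cluster found: start a new one
    return clusters

def update_profile(profile, clusters, match_threshold=0.9, weight=0.5):
    """Blend the profile with the first sufficiently similar cluster centroid."""
    for cluster in clusters:
        center = centroid(cluster)
        if cosine(profile, center) >= match_threshold:
            return [(1 - weight) * p + weight * c for p, c in zip(profile, center)]
    return profile  # no matching cluster: leave the profile unchanged
```

A production system would likely use a more robust clustering method; the greedy pass here only keeps the example short.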
-
Publication No.: US11893999B1
Publication Date: 2024-02-06
Application No.: US16055755
Application Date: 2018-08-06
Applicant: Amazon Technologies, Inc.
Inventor: Sai Sailesh Kopuri, John Moore, Sundararajan Srinivasan, Aparna Khare, Arindam Mandal, Spyridon Matsoukas, Rohit Prasad
Abstract: Techniques for enrolling a user in a system's user recognition functionality without requiring the user speak particular speech are described. The system may determine characteristics unique to a user input. The system may generate an implicit voice profile from user inputs having similar characteristics. After an implicit voice profile is generated, the system may receive a user input having speech characteristics similar to that of the implicit voice profile. The system may ask the user if the user wants the system to associate the implicit voice profile with a particular user identifier. If the user responds affirmatively, the system may request an identifier of a user profile (e.g., a user name). In response to receiving the user's name, the system may identify a user profile associated with the name and associate the implicit voice profile with the user profile, thereby converting the implicit voice profile into an explicit voice profile.
-
Publication No.: US20210183358A1
Publication Date: 2021-06-17
Application No.: US16712567
Application Date: 2019-12-12
Applicant: Amazon Technologies, Inc.
Inventor: Hongda Mao, George Yu-Chien Lin, Sundararajan Srinivasan, Chu-Cheng Hsieh
IPC: G10L13/027, G10L13/04, G10L17/04, G10L17/18
Abstract: A first neural network model of a user device processes audio data to extract audio embeddings that represent vocal characteristics of a user of an utterance represented in the audio data. The audio embeddings may then be hashed to remove characteristics specific to the user while still maintaining a unique set of characteristics. The hashed embeddings may be sent to a remote system, which may use them to identify the user.
-
Publication No.: US11004454B1
Publication Date: 2021-05-11
Application No.: US16182021
Application Date: 2018-11-06
Applicant: Amazon Technologies, Inc.
Inventor: Sundararajan Srinivasan, Arindam Mandal, Krishna Subramanian, Spyridon Matsoukas, Aparna Khare, Rohit Prasad
Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.
-
Publication No.: US20250029612A1
Publication Date: 2025-01-23
Application No.: US18356117
Application Date: 2023-07-20
Applicant: Amazon Technologies, Inc.
Inventor: Lei Xu, Aparna Elangovan, Rohit Paturi, Sundararajan Srinivasan, Sravan Babu Bodapati, Katrin Kirchoff, Sarthak Handa
Abstract: Transcript generation as part of automatic speech recognition may be guided using section types. Audio data is received for transcription. An initial transcript of the audio data may be generated and evaluated to determine a section type for the audio data. The section type may then be used to focus generation of a second version of the transcript on one speaker over another speaker.
-
Publication No.: US11200884B1
Publication Date: 2021-12-14
Application No.: US16181925
Application Date: 2018-11-06
Applicant: Amazon Technologies, Inc.
Inventor: Sundararajan Srinivasan, Arindam Mandal, Krishna Subramanian, Spyridon Matsoukas, Aparna Khare, Rohit Prasad
Abstract: Techniques for labeling user inputs for updating user recognition voice profiles are described. A system may leverage various signals, generated during or after processing of a user input, to retroactively determine which user spoke the user input. For example, after the system receives the user input, the user may provide the system with non-spoken user verification information. Based on such user verification information, the system may label the previously spoken user input as originating from the particular user. The system may also or alternatively use system usage history to retroactively label user inputs.