-
Publication number: US12087320B1
Publication date: 2024-09-10
Application number: US17671194
Filing date: 2022-02-14
Applicant: Amazon Technologies, Inc.
Inventor: Qin Zhang, Qingming Tang, Ming Sun, Chao Wang, Steve Mark Lorusso, Andrew Thomas Bydlon, James Garnet Droppo, Viktor Rozgic, Sripal Mehta, Yang Liu
CPC classification number: G10L25/51, G10L15/1815, G10L15/22, G10L15/30
Abstract: A system may be configured to detect custom acoustic events, where the system generates an acoustic event profile for the custom acoustic event based on a natural language description provided by a user and using an audio sample of the described acoustic event. For example, the user may describe the custom acoustic event as “dog bark.” The system may ask the user questions to refine the description (e.g., dog breed, dog gender, age, etc.). Using an audio sample of the refined description, the system may then determine that audio captured in the user's environment is a potential sample of the custom acoustic event. Such captured audio may be presented to the user for confirmation, and then may be used to detect future occurrences of the custom acoustic event in the user's environment.
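The matching step in this abstract (comparing captured audio against an audio sample of the described event) can be illustrated with a minimal sketch. It assumes each audio clip has already been reduced to a fixed-length embedding vector and that similarity is measured with cosine similarity; neither detail is stated in the abstract, and the names `cosine_similarity`, `find_candidates`, and the `threshold` value are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_candidates(reference, captured, threshold=0.8):
    """Return indices of captured embeddings that resemble the reference sample.

    Assumption: embeddings come from some upstream audio encoder, and the
    threshold is an illustrative value, not one taken from the patent.
    """
    return [
        i
        for i, embedding in enumerate(captured)
        if cosine_similarity(reference, embedding) >= threshold
    ]


if __name__ == "__main__":
    reference = [1.0, 0.0]
    captured = [[1.0, 0.1], [0.0, 1.0], [0.9, 0.2]]
    # Clips whose indices are returned would be shown to the user for confirmation.
    print(find_candidates(reference, captured))
```

In this sketch the returned clips are only candidates; the abstract describes a user confirmation step before they are used for future detection.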
-
Publication number: US20240071408A1
Publication date: 2024-02-29
Application number: US18243804
Filing date: 2023-09-08
Applicant: Amazon Technologies, Inc.
Inventor: Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Chao Wang, Sumit Garg, Rong Chen, James Garnet Droppo, Chia-Jung Chang
Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.
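The arrangement described above, in which two detection components consume the same acoustic feature data, can be sketched roughly as follows. This is an illustration only: it assumes each component is a set of simple logistic classifier heads, which the abstract does not specify, and the names `detect_events`, `predetermined_heads`, and `custom_heads` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def head_score(weights, bias, features):
    """Score one event head on the shared feature vector."""
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)


def detect_events(features, predetermined_heads, custom_heads, threshold=0.5):
    """Run both components on the same features and list detected events.

    Each heads argument maps an event name to a (weights, bias) pair.
    Assumption: linear heads stand in for whatever task-specific models
    the two AED components actually use.
    """
    detected = []
    components = (("predetermined", predetermined_heads), ("custom", custom_heads))
    for kind, heads in components:
        for name, (weights, bias) in heads.items():
            if head_score(weights, bias, features) >= threshold:
                detected.append((kind, name))
    return detected


if __name__ == "__main__":
    shared_features = [1.0, 0.0]
    predetermined = {"glass_break": ([4.0, 0.0], -2.0)}
    custom = {"doorbell": ([-4.0, 0.0], 0.0)}
    print(detect_events(shared_features, predetermined, custom))
```

Computing the feature vector once and passing it to both components is the point of the sketch: the feature extraction cost is shared, while each component keeps its own task-specific parameters.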
-
Publication number: US11790932B2
Publication date: 2023-10-17
Application number: US17547644
Filing date: 2021-12-10
Applicant: Amazon Technologies, Inc.
Inventor: Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Chao Wang, Sumit Garg, Rong Chen, James Garnet Droppo, Chia-Jung Chang
CPC classification number: G10L25/51, G06N3/045, G06N3/08, G10L25/21, G10L25/30, G10L15/08, G10L15/22, G10L2015/088, G10L2015/223
Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.
-
Publication number: US12039998B1
Publication date: 2024-07-16
Application number: US17665129
Filing date: 2022-02-04
Applicant: Amazon Technologies, Inc.
Inventor: Chieh-Chi Kao, Qingming Tang, Ming Sun, Viktor Rozgic, Spyridon Matsoukas, Chao Wang
Abstract: An acoustic event detection system may employ self-supervised federated learning to update encoder and/or classifier machine learning models. In an example operation, an encoder may be pre-trained to extract audio feature data from an audio signal. A decoder may be pre-trained to predict a subsequent portion of audio data (e.g., a subsequent frame of audio data represented by log filterbank energies). The encoder and decoder may be trained using self-supervised learning to improve the decoder's predictions and, by extension, the quality of the audio feature data generated by the encoder. The system may apply federated learning to share encoder updates across user devices. The system may fine-tune the classifier to improve inferences based on the improved audio feature data. The system may distribute classifier updates to the user device(s) to update the on-device classifier.
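The step of sharing encoder updates across user devices can be illustrated with sample-weighted federated averaging (FedAvg). The abstract does not say which aggregation rule is used, so FedAvg is an assumption here, as are the name `federated_average` and the representation of encoder parameters as flat lists of floats.

```python
def federated_average(device_weights, sample_counts):
    """Combine per-device encoder parameters into one global parameter vector.

    device_weights: list of equal-length parameter vectors, one per device.
    sample_counts: number of local training examples behind each vector.
    Assumption: sample-weighted averaging (FedAvg); the patent abstract
    only states that federated learning is applied.
    """
    if not device_weights or len(device_weights) != len(sample_counts):
        raise ValueError("need one sample count per device weight vector")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(device_weights[0])
    return [
        sum(weights[i] * count for weights, count in zip(device_weights, sample_counts)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two devices; the second trained on three times as much local audio.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Only parameter vectors leave the devices in this sketch; the raw audio used for self-supervised training stays local, which is the usual motivation for a federated setup.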
-
Publication number: US11961514B1
Publication date: 2024-04-16
Application number: US17547610
Filing date: 2021-12-10
Applicant: Amazon Technologies, Inc.
Inventor: Chia-Jung Chang, Qingming Tang, Ming Sun, Chao Wang
CPC classification number: G10L15/16
Abstract: An acoustic event detection system may employ one or more recurrent neural networks (RNNs) to extract features from audio data, and use the extracted features to determine the presence of an acoustic event. The system may use self-attention to emphasize features extracted from portions of audio data that may include features more useful for detecting acoustic events. The system may perform self-attention in an iterative manner to reduce the amount of memory used to store hidden states of the RNN while processing successive portions of the audio data. The system may process the portions of the audio data using the RNN to generate a hidden state for each portion. The system may calculate an interim embedding for each hidden state. An interim embedding calculated for the last hidden state may be normalized to determine a final embedding representing features extracted from the input data by the RNN.
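One way to read the iterative self-attention described above is as a streaming softmax-weighted average of hidden states, where only running totals are kept instead of the full list of hidden states. The sketch below follows that reading under stated assumptions: the attention score is a dot product with a learned vector (here `query`), and the running numerator plays the role of the interim embedding. The patent may define these quantities differently, and all names are hypothetical.

```python
import math


def streaming_attention_pool(hidden_states, query):
    """Attention-pool hidden states one at a time using constant extra memory.

    Keeps a running maximum score, a running softmax denominator, and a
    running weighted sum (the interim embedding). The final embedding is
    the last interim embedding divided by the denominator.
    """
    running_max = float("-inf")
    denominator = 0.0
    interim = None
    for state in hidden_states:
        if interim is None:
            interim = [0.0] * len(state)
        score = sum(q * h for q, h in zip(query, state))
        new_max = max(running_max, score)
        # Rescale earlier totals so all exponentials share the same offset.
        rescale = math.exp(running_max - new_max)
        weight = math.exp(score - new_max)
        denominator = denominator * rescale + weight
        interim = [v * rescale + weight * h for v, h in zip(interim, state)]
        running_max = new_max
    if interim is None:
        raise ValueError("hidden_states must not be empty")
    return [v / denominator for v in interim]


if __name__ == "__main__":
    states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(streaming_attention_pool(states, query=[0.5, -0.5]))
```

Because each hidden state is folded into the running totals as soon as it is produced, the memory used for attention does not grow with the length of the audio, which matches the memory-saving motivation given in the abstract.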
-
Publication number: US20230186939A1
Publication date: 2023-06-15
Application number: US17547644
Filing date: 2021-12-10
Applicant: Amazon Technologies, Inc.
Inventor: Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Chao Wang, Sumit Garg, Rong Chen, James Garnet Droppo, Chia-Jung Chang
Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.
-
Publication number: US11069352B1
Publication date: 2021-07-20
Application number: US16278440
Filing date: 2019-02-18
Applicant: Amazon Technologies, Inc.
Inventor: Qingming Tang, Ming Sun, Chieh-Chi Kao, Chao Wang, Viktor Rozgic
Abstract: Described herein is a system for media presence detection in audio. The system analyzes audio data to recognize whether a given audio segment contains sounds from a media source as a way of differentiating recorded media source sounds from other live sounds. In exemplary embodiments, the system includes a hierarchical model architecture for processing audio data segments, where individual audio data segments are processed by a trained machine learning model operating locally, and another trained machine learning model provides historical and contextual information to determine a score indicating the likelihood that the audio data segment contains sounds from a media source.
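The two-level structure in this abstract (a local per-segment model plus a model that adds historical context) can be sketched in a very simplified form. The sketch assumes the local model has already produced a score in [0, 1] for each segment and uses an exponential moving average as a stand-in for the trained contextual model; the real system uses trained machine learning models at both levels, and the names `media_presence_scores` and `history_weight` are hypothetical.

```python
def media_presence_scores(local_scores, history_weight=0.5):
    """Blend each segment's local score with the history of earlier segments.

    local_scores: per-segment scores from a (hypothetical) local model.
    history_weight: how much the running context counts versus the new segment.
    Assumption: an exponential moving average replaces the trained
    contextual model described in the abstract.
    """
    if not 0.0 <= history_weight <= 1.0:
        raise ValueError("history_weight must be between 0 and 1")
    context = None
    scores = []
    for local in local_scores:
        if context is None:
            context = local  # the first segment has no history to draw on
        else:
            context = history_weight * context + (1.0 - history_weight) * local
        scores.append(context)
    return scores


if __name__ == "__main__":
    # A brief dip in the local score is softened by the surrounding context.
    print(media_presence_scores([1.0, 0.0, 1.0]))
```

The design point the sketch tries to capture is that a single ambiguous segment should not flip the decision on its own when neighbouring segments strongly suggest media playback.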