-
Publication No.: US20200349921A1
Publication Date: 2020-11-05
Application No.: US16758564
Application Date: 2018-10-26
Applicant: Google LLC
Inventor: Aren Jansen, Manoj Plakal, Richard Channing Moore, Shawn Hershey, Ratheet Pandya, Ryan Rifkin, Jiayang Liu, Daniel Ellis
Abstract: Methods are provided for generating training triplets that can be used to train multidimensional embeddings to represent the semantic content of non-speech sounds present in a corpus of audio recordings. These training triplets can be used with a triplet loss function to train the multidimensional embeddings such that the embeddings can be used to cluster the contents of a corpus of audio recordings, to facilitate a query-by-example lookup from the corpus, to allow a small number of manually-labeled audio recordings to be generalized, or to facilitate some other audio classification task. The triplet sampling methods may be used individually or collectively, and each represents a respective heuristic about the semantic structure of audio recordings.
-
Publication No.: US11335328B2
Publication Date: 2022-05-17
Application No.: US16758564
Application Date: 2018-10-26
Applicant: Google LLC
Inventor: Aren Jansen, Manoj Plakal, Richard Channing Moore, Shawn Hershey, Ratheet Pandya, Ryan Rifkin, Jiayang Liu, Daniel Ellis
Abstract: Methods are provided for generating training triplets that can be used to train multidimensional embeddings to represent the semantic content of non-speech sounds present in a corpus of audio recordings. These training triplets can be used with a triplet loss function to train the multidimensional embeddings such that the embeddings can be used to cluster the contents of a corpus of audio recordings, to facilitate a query-by-example lookup from the corpus, to allow a small number of manually-labeled audio recordings to be generalized, or to facilitate some other audio classification task. The triplet sampling methods may be used individually or collectively, and each represents a respective heuristic about the semantic structure of audio recordings.
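Illustrative note: the abstract names a triplet loss and heuristic triplet sampling but gives neither in detail. The sketch below shows one common form of each, as an assumption rather than the claimed method. It assumes a hinge loss on squared Euclidean distance, and it assumes one possible sampling heuristic in which the positive is a slightly noise-perturbed copy of the anchor and the negative is a different example drawn at random. The names `triplet_loss`, `noisy_positive_triplet`, `margin`, and `noise_std` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge triplet loss (assumed form, not taken from the abstract).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin` in squared distance.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def noisy_positive_triplet(examples, anchor_index, noise_std=0.01, rng=None):
    """Build one (anchor, positive, negative) triplet.

    Assumed heuristic: a small Gaussian perturbation should not change
    what a sound means, so the perturbed anchor serves as the positive.
    The negative is any other example in the collection.
    """
    rng = rng or random.Random(0)
    anchor = examples[anchor_index]
    positive = [x + rng.gauss(0.0, noise_std) for x in anchor]
    others = [i for i in range(len(examples)) if i != anchor_index]
    negative = examples[rng.choice(others)]
    return anchor, positive, negative
```

In a training setup of this kind, each element of the triplet would first be mapped through the embedding model, and the loss would be computed on the resulting embeddings and minimized over many sampled triplets.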
-