-
公开(公告)号:US20240104352A1
公开(公告)日:2024-03-28
申请号:US18012391
申请日:2022-07-28
Applicant: Google LLC
Inventor: Yu Zhang , Yu-An Chung , Wei Han , Chung-Cheng Chiu , Weikeng Qin , Ruoming Pang , Yonghui Wu
IPC: G06N3/0455
CPC classification number: G06N3/0455
Abstract: Provided are improved end-to-end self-supervised pre-training frameworks that leverage a combination of contrastive and masked modeling loss terms. In particular, the present disclosure provides framework that combines contrastive learning and masked modeling, where the former trains the model to discretize input data (e.g., continuous signals such as continuous speech signals) into a finite set of discriminative tokens, and the latter trains the model to learn contextualized representations via solving a masked prediction task consuming the discretized tokens. In contrast to certain existing masked modeling-based pre-training frameworks which rely on an iterative re-clustering and re-training process or other existing frameworks which concatenate two separately trained modules, the proposed framework can enable a model to be optimized in an end-to-end fashion by solving the two self-supervised tasks (the contrastive task and masked modeling) simultaneously.