1.
Publication No.: US20250036940A1
Publication Date: 2025-01-30
Application No.: US18637161
Filing Date: 2024-04-16
Applicant: Apple Inc.
Inventor: Ali MOIN, Ran LIU, Ellen L. ZIPPI, Mohammad Hadi POUR ANSARI, Christopher M. SANDINO, Erdrin AZEMI
IPC: G06N3/08
Abstract: The subject technology provides frequency-aware masked autoencoders for multimodal pretraining on frequency-based signals. An apparatus receives input data comprising frequency-based signal information associated with one or more modalities. The apparatus transforms the input data from a time domain to a frequency domain. The apparatus generates a frequency-embedded latent representation of the input data comprising time-domain and frequency-domain information. The apparatus also generates a masked frequency-embedded latent representation by masking one or more frequency components in the frequency-embedded latent representation. The apparatus produces a trained machine learning model by training a neural network to predict one or more masked frequency components of the frequency-embedded latent representation.
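Illustrative sketch (not taken from the patent): the abstract's steps of transforming a signal to the frequency domain, masking frequency components, and scoring a prediction of the masked components can be shown in a few lines of NumPy. This is a simplification under stated assumptions. It masks bins of the raw spectrum rather than a learned frequency-embedded latent representation, it trains no neural network, and the function names `mask_frequency_components` and `masked_reconstruction_loss`, the `mask_ratio` parameter, and the squared-error objective are all hypothetical choices made for this example.

```python
import numpy as np


def mask_frequency_components(signal, mask_ratio=0.5, seed=0):
    """Transform a 1-D real signal to the frequency domain and zero out
    a random subset of frequency bins.

    Hypothetical helper: returns (spectrum, masked_spectrum, mask), where
    mask is True at the bins that were hidden.
    """
    rng = np.random.default_rng(seed)
    spectrum = np.fft.rfft(signal)              # time domain -> frequency domain
    n_bins = spectrum.shape[0]
    n_masked = int(mask_ratio * n_bins)         # number of bins to hide
    masked_idx = rng.choice(n_bins, size=n_masked, replace=False)
    mask = np.zeros(n_bins, dtype=bool)
    mask[masked_idx] = True
    masked_spectrum = np.where(mask, 0.0, spectrum)
    return spectrum, masked_spectrum, mask


def masked_reconstruction_loss(predicted, target, mask):
    """Mean squared error computed over the masked bins only.

    Assumed objective for illustration; the abstract does not specify one.
    """
    if not mask.any():
        return 0.0
    diff = predicted[mask] - target[mask]
    return float(np.mean(np.abs(diff) ** 2))


# Usage: a 5 Hz sine sampled at 256 points over one second.
t = np.arange(256) / 256.0
signal = np.sin(2 * np.pi * 5 * t)
spectrum, masked_spectrum, mask = mask_frequency_components(signal)

# A model would receive masked_spectrum and try to predict spectrum[mask].
# Here the untrained "prediction" is simply the masked input itself.
baseline_loss = masked_reconstruction_loss(masked_spectrum, spectrum, mask)
perfect_loss = masked_reconstruction_loss(spectrum, spectrum, mask)
```

In a real pretraining setup, the model's output would replace `masked_spectrum` in the loss call, and the loss would be minimized over many signals and modalities; restricting the loss to masked positions is the usual masked-autoencoder convention and is assumed here rather than stated in the abstract.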