FREQUENCY-AWARE MASKED AUTOENCODERS FOR MULTIMODAL PRETRAINING ON FREQUENCY-BASED SIGNALS

    Publication No.: US20250036940A1

    Publication Date: 2025-01-30

    Application No.: US18637161

    Filing Date: 2024-04-16

    Applicant: Apple Inc.

    Abstract: The subject technology provides frequency-aware masked autoencoders for multimodal pretraining on frequency-based signals. An apparatus receives input data comprising frequency-based signal information associated with one or more modalities. The apparatus transforms the input data from a time domain to a frequency domain. The apparatus generates a frequency-embedded latent representation of the input data comprising time-domain and frequency-domain information. The apparatus also generates a masked frequency-embedded latent representation by masking one or more frequency components in the frequency-embedded latent representation. The apparatus produces a trained machine learning model by training a neural network to predict one or more masked frequency components of the frequency-embedded latent representation.
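
The masking step described in the abstract can be illustrated with a minimal sketch. It is not the implementation claimed in the application: it assumes a single-channel toy signal, applies a naive discrete Fourier transform directly to the raw samples instead of building a learned frequency-embedded latent representation, and uses a "predict zero" baseline in place of a trained neural network. All function names (`dft`, `mask_frequency_components`, `reconstruction_loss`) and the 25% mask ratio are hypothetical choices made for illustration.

```python
import cmath
import math
import random


def dft(signal):
    """Naive discrete Fourier transform: time domain -> frequency domain."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def mask_frequency_components(spectrum, mask_ratio, rng):
    """Zero out a random subset of frequency bins (hypothetical helper).

    Returns the masked spectrum, the sorted masked indices, and the original
    values at those indices, which serve as prediction targets.
    """
    n = len(spectrum)
    num_masked = max(1, int(round(mask_ratio * n)))
    masked_idx = sorted(rng.sample(range(n), num_masked))
    masked = list(spectrum)
    for k in masked_idx:
        masked[k] = 0j
    targets = [spectrum[k] for k in masked_idx]
    return masked, masked_idx, targets


def reconstruction_loss(predicted, targets):
    """Mean squared error computed over the masked bins only."""
    return sum(abs(p - t) ** 2 for p, t in zip(predicted, targets)) / len(targets)


# Toy signal (assumption): a sine wave with 2 cycles over 16 samples.
signal = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
spectrum = dft(signal)

# Mask 25% of the frequency bins with a seeded generator for repeatability.
masked, masked_idx, targets = mask_frequency_components(
    spectrum, 0.25, random.Random(0)
)

# Stand-in for a model: predicting zero for every masked bin. A trained
# network would supply these predictions and be optimized on this loss.
baseline_loss = reconstruction_loss([0j] * len(targets), targets)
```

In a training loop, the masked spectrum (or a latent derived from it) would be the model input and `targets` the values to recover, with the loss restricted to the masked positions as in `reconstruction_loss`.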
