Multimodal Learning from Structured and Unstructured Data

Invention Application

US20240386321A1 Multimodal Learning from Structured and Unstructured Data (Active)

Patent Title: Multimodal Learning from Structured and Unstructured Data
Application No.: US18639519

Application Date: 2024-04-18
Publication No.: US20240386321A1

Publication Date: 2024-11-21
Inventor: Sayna Ebrahimi, Yihe Dong, Tomas Pfister, Sercan Omer Arik
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Main IPC: G06N20/00
IPC: G06N20/00

Abstract:

Aspects of the disclosure are directed to a multimodal processing system for processing both structured and unstructured data. Real-world data is not always consistent in form or content. The multimodal processing system includes a model that can be trained to account for this characteristic of real-world data by selectively masking data of different modalities during pretraining, so that the model learns outputs that are the same or comparable between the masked and unmasked inputs. The model is trained according to modality-specific masking objectives computed for each modality of data and joint-modality, similarity-based masking objectives computed for a joint representation of the data across all modalities. The system provides consistent and accurate output, even when substantial portions of the input data are missing from different modalities. Cross-modal relationships in the data are reinforced by the model as different portions of data are masked, contributing to an overall increase in model accuracy versus other approaches.
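
The training scheme summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the claimed implementation. Everything named here is an assumption made for illustration: the two modalities (a tabular array and a text-embedding array), the single-layer tanh encoders, the linear decoders, the mean-squared-error reconstruction term, the cosine-based similarity term, and the parameters `mask_ratio` and `joint_weight`. The sketch only computes the combined objective for one batch and does not perform any optimization.

```python
import numpy as np


def random_mask(x, mask_ratio, rng):
    """Return a copy of x with a random subset of entries zeroed, plus the mask (True = masked)."""
    mask = rng.random(x.shape) < mask_ratio
    return np.where(mask, 0.0, x), mask


def masked_mse(x, x_hat, mask):
    """Hypothetical modality-specific objective: squared error on masked entries only."""
    if not mask.any():
        return 0.0
    return float(np.mean((x[mask] - x_hat[mask]) ** 2))


def cosine_loss(z_a, z_b, eps=1e-8):
    """Hypothetical joint objective: 1 - cosine similarity of paired rows, averaged."""
    num = np.sum(z_a * z_b, axis=1)
    den = np.linalg.norm(z_a, axis=1) * np.linalg.norm(z_b, axis=1) + eps
    return float(np.mean(1.0 - num / den))


def encode(tabular, text, params):
    """Toy per-modality encoders and a concatenated joint representation (assumed design)."""
    h_tab = np.tanh(tabular @ params["enc_tab"])
    h_txt = np.tanh(text @ params["enc_txt"])
    return h_tab, h_txt, np.concatenate([h_tab, h_txt], axis=1)


def pretraining_loss(tabular, text, params, rng, mask_ratio=0.3, joint_weight=1.0):
    """Sum of per-modality masked reconstruction terms and a joint similarity term."""
    tab_masked, tab_mask = random_mask(tabular, mask_ratio, rng)
    txt_masked, txt_mask = random_mask(text, mask_ratio, rng)

    h_tab, h_txt, joint_masked = encode(tab_masked, txt_masked, params)
    _, _, joint_full = encode(tabular, text, params)

    # Modality-specific terms: reconstruct each modality's masked entries.
    loss_tab = masked_mse(tabular, h_tab @ params["dec_tab"], tab_mask)
    loss_txt = masked_mse(text, h_txt @ params["dec_txt"], txt_mask)

    # Joint term: masked and unmasked inputs should map to similar joint representations.
    loss_joint = cosine_loss(joint_masked, joint_full)
    return loss_tab + loss_txt + joint_weight * loss_joint


# Synthetic data and randomly initialized parameters, for illustration only.
rng = np.random.default_rng(0)
tabular = rng.normal(size=(4, 6))  # 4 samples, 6 structured features
text = rng.normal(size=(4, 8))     # 4 samples, 8-dim stand-in for text embeddings
hidden = 5
params = {
    "enc_tab": rng.normal(size=(6, hidden)) * 0.1,
    "dec_tab": rng.normal(size=(hidden, 6)) * 0.1,
    "enc_txt": rng.normal(size=(8, hidden)) * 0.1,
    "dec_txt": rng.normal(size=(hidden, 8)) * 0.1,
}
print(pretraining_loss(tabular, text, params, rng))
```

In this sketch the joint term is what encourages outputs that are "the same or comparable" between masked and unmasked inputs, while the per-modality terms stand in for the modality-specific masking objectives. A real system would replace the toy encoders with trained networks and minimize the loss with a gradient-based optimizer.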

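IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computing arrangements based on specific computational models
G06N20/00	Machine learning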