METHODS AND SYSTEMS FOR TRAINING ARTIFICIAL INTELLIGENCE-BASED MODELS USING LIMITED LABELED DATA

    Publication No.: US20240403369A1

    Publication Date: 2024-12-05

    Application No.: US18733518

    Filing Date: 2024-06-04

    Abstract: Methods and systems for training artificial intelligence (AI)-based models using limited labeled data are disclosed. A method performed by a server system includes accessing a tabular dataset including tabular data that further includes labeled data and unlabeled data. The method includes generating labeled features, including labeled numerical features and labeled categorical features, based on the labeled data, and generating unlabeled features, including unlabeled numerical features and unlabeled categorical features, based on the unlabeled data. The method includes determining, via a first transformer model, contextual numerical embeddings based on the labeled numerical features and the unlabeled numerical features. The method includes determining, via a second transformer model, contextual categorical embeddings based on the labeled categorical features and the unlabeled categorical features. The method includes generating concatenated embeddings by concatenating the contextual numerical embeddings and the contextual categorical embeddings. The method includes generating a third transformer model based on the concatenated embeddings.
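
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not specify model sizes, tokenization, or a training objective, so everything below is an assumption made for illustration. Each "transformer model" is reduced to a single parameter-free self-attention step, the toy rows, column counts, embedding width D, and the names self_attention, encode_row, num_weights, and cat_tables are all hypothetical, and no training is performed.

```python
import math
import random

random.seed(0)
D = 4  # embedding width (assumed; not given in the abstract)


def rand_vec(d=D):
    """Random vector standing in for parameters a real system would learn."""
    return [random.uniform(-1.0, 1.0) for _ in range(d)]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity Q/K/V
    projections: a drastically simplified stand-in for a transformer model."""
    d = len(tokens[0])
    contextual = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in tokens]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        contextual.append([
            sum(w / total * value[j] for w, value in zip(weights, tokens))
            for j in range(d)
        ])
    return contextual


# Hypothetical toy rows: two numerical columns and two categorical columns.
# The label would supervise training, which this sketch does not show.
labeled_rows = [{"num": [0.5, 1.2], "cat": [0, 2], "label": 1}]
unlabeled_rows = [{"num": [0.1, 0.7], "cat": [1, 0]}]

# Per-column parameters (random here, learned in a real system).
num_weights = [rand_vec() for _ in range(2)]                     # one vector per numerical column
cat_tables = [[rand_vec() for _ in range(3)] for _ in range(2)]  # 3 categories per column


def encode_row(row):
    # Turn each feature into a D-dimensional token.
    num_tokens = [[row["num"][j] * w for w in num_weights[j]] for j in range(2)]
    cat_tokens = [cat_tables[j][row["cat"][j]] for j in range(2)]
    contextual_num = self_attention(num_tokens)   # stands in for the first model
    contextual_cat = self_attention(cat_tokens)   # stands in for the second model
    concatenated = contextual_num + contextual_cat  # concatenate along the token axis
    return self_attention(concatenated)           # stands in for the third model


# Labeled and unlabeled rows go through the same encoders.
encoded = [encode_row(row) for row in labeled_rows + unlabeled_rows]
```

Here each row ends up as four contextual tokens of width D (two numerical, two categorical). Concatenating along the token axis is one plausible reading of the abstract; concatenating along the feature dimension would be another.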
