Publication No.: US20250086389A1
Publication Date: 2025-03-13
Application No.: US18367310
Filing Date: 2023-09-12
Inventors: Gayathri SARANATHAN, Nway Nway AUNG, Ariel BECK, Chandra Suwandi WIJAYA, Jianyu CHEN, Debdeep PAUL, Sahim YAMAURA, Koji MIURA
IPC: G06F40/279, G06N20/00
Abstract: According to an embodiment, a method for generating textual features corresponding to text documents from a raw dataset is disclosed. The method includes preprocessing the text documents and determining topic probability scores (TPS) and confidence scores (CS) using unsupervised and supervised machine learning models, respectively. The combination of TPS and CS is used to generate a compound distribution score (CDS), which forms a comprehensive representation of the output of the machine learning models. The determined TPS, CS, and CDS are then used to generate a set of textual features, which serve as independent variables for a forecasting model.
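The abstract does not state how TPS and CS are combined, so the following is only an illustrative sketch, not the claimed method. It assumes that the TPS (one probability per topic) and CS (one confidence per class) for a document are already available from the two models, and it assumes, purely for illustration, that the CDS is the normalized outer product of the two vectors. The function names `compound_distribution_score` and `textual_features` are hypothetical.

```python
from typing import List, Sequence


def compound_distribution_score(
    tps: Sequence[float], cs: Sequence[float]
) -> List[List[float]]:
    """Hypothetical CDS: normalized outer product of TPS and CS.

    ASSUMPTION: the source does not define the combination rule; the
    outer product is used here only as one plausible choice.
    """
    # Entry [i][j] pairs topic i's probability with class j's confidence.
    outer = [[t * c for c in cs] for t in tps]
    total = sum(sum(row) for row in outer)
    if total <= 0:
        raise ValueError("TPS and CS must each contain a positive entry")
    # Normalize so that all entries of the CDS sum to 1.
    return [[value / total for value in row] for row in outer]


def textual_features(tps: Sequence[float], cs: Sequence[float]) -> List[float]:
    """Concatenate TPS, CS, and the flattened CDS into one feature vector."""
    cds = compound_distribution_score(tps, cs)
    flat_cds = [value for row in cds for value in row]
    return list(tps) + list(cs) + flat_cds


# Made-up scores for a single document: 3 topics, 2 classes.
tps_example = [0.7, 0.2, 0.1]
cs_example = [0.9, 0.1]
features = textual_features(tps_example, cs_example)
```

Under these assumptions, each document yields a fixed-length vector (number of topics + number of classes + their product) that could be supplied to a forecasting model as independent variables.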