- Patent title: METHOD AND SYSTEM FOR AUTOMATED COLUMN TYPE ANNOTATION
- Application number: US18338302
- Filing date: 2023-06-20
- Publication number: US20230418802A1
- Publication date: 2023-12-28
- Inventors: Martin Ringsquandl, Mitchell Joblin, Aneta Koleva, Swathi Shyam Sunder
- Applicant: Siemens Aktiengesellschaft
- Applicant address: DE München
- Patentee: Siemens Aktiengesellschaft
- Current patentee: Siemens Aktiengesellschaft
- Current patentee address: DE München
- Priority: EP 180444.6 2022.06.22
- Main classification: G06F16/22
- IPC classification: G06F16/22; G06F16/21
Abstract:
A solution for automated column type annotation maps each column contained in a table to a column annotation class. A pre-processor transforms the table into a numerical tensor representation by outputting a sequence of cell tokens for each cell in the table. A table encoder encodes the sequences of cell tokens and a column annotation label for each column into body cell embeddings. A body pooling component processes the body cell embeddings to provide column representations. A classifier classifies the column representations to provide, for each column, confidence scores for each column annotation class. The method concludes by comparing the highest confidence score for each column with a threshold and, if the highest confidence score for each column is above the threshold, annotating each column with the respective column annotation class.
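
The final thresholding step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `annotate_columns`, the example class names, the scores, and the threshold value are all hypothetical, and the sketch assumes that the comparison is made independently for each column and that a column whose top score is not strictly above the threshold is left unannotated (`None`). The pre-processor, table encoder, pooling component, and classifier are not modeled; the confidence scores are simply given as input.

```python
from typing import List, Optional, Sequence


def annotate_columns(
    scores: Sequence[Sequence[float]],
    classes: Sequence[str],
    threshold: float,
) -> List[Optional[str]]:
    """Pick one annotation class per column, or None if confidence is too low.

    scores[i][j] is the (hypothetical) classifier confidence that column i
    belongs to classes[j].
    """
    annotations: List[Optional[str]] = []
    for column_scores in scores:
        if len(column_scores) != len(classes):
            raise ValueError("each column needs one score per annotation class")
        # Index of the highest confidence score for this column.
        best = max(range(len(classes)), key=lambda j: column_scores[j])
        # Annotate only when the top score is strictly above the threshold
        # (assumption: otherwise the column stays unannotated).
        if column_scores[best] > threshold:
            annotations.append(classes[best])
        else:
            annotations.append(None)
    return annotations


# Illustrative values only; they do not come from the patent.
example_classes = ["date", "city", "price"]
example_scores = [
    [0.91, 0.05, 0.04],
    [0.40, 0.35, 0.25],
    [0.02, 0.08, 0.90],
]
example_result = annotate_columns(example_scores, example_classes, threshold=0.8)
```

With these illustrative inputs, the first and third columns are annotated and the second, whose best score is 0.40, is left without an annotation.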