- Patent title: TRANSFORMER-BASED ENCODING INCORPORATING METADATA
- Application number: US17308575
- Filing date: 2021-05-05
- Publication number: US20220358288A1
- Publication date: 2022-11-10
- Inventors: Hui Wan, Xiaodong Cui, Luis A. Lastras-Montano
- Applicant: International Business Machines Corporation
- Applicant address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current assignee: International Business Machines Corporation
- Current assignee address: US NY Armonk
- Main classification: G06F40/284
- IPC classifications: G06F40/284; G06F40/205; G06F40/237; G06F40/30; G06F40/42; G06K9/66
Abstract:
From metadata of a corpus of natural language text documents, a relativity matrix is constructed, a row-column intersection in the relativity matrix corresponding to a relationship between two instances of a type of metadata. An encoder model is trained, generating a trained encoder model, to compute an embedding corresponding to a token of a natural language text document within the corpus and the relativity matrix, the encoder model comprising a first encoder layer, the first encoder layer comprising a token embedding portion, a relativity embedding portion, a token self-attention portion, a metadata self-attention portion, and a fusion portion, the training comprising adjusting a set of parameters of the encoder model.
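As a rough illustration of the relativity matrix described in the abstract, the sketch below builds a matrix whose entry at row i, column j encodes a relationship between two instances of one metadata type. Everything specific here is an assumption made for illustration and is not taken from the patent: the metadata type (section paths within a document), the four relationship codes, and the names `relation` and `build_relativity_matrix` are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical relationship codes; the abstract does not enumerate them.
SAME, PARENT_CHILD, SIBLING, UNRELATED = 0, 1, 2, 3


def relation(a: Tuple[str, ...], b: Tuple[str, ...]) -> int:
    """Classify how two section paths relate (illustrative scheme only)."""
    if a == b:
        return SAME
    # One path is the immediate parent of the other.
    if a[:-1] == b or b[:-1] == a:
        return PARENT_CHILD
    # Distinct sections at the same depth that share a parent.
    if len(a) == len(b) and a[:-1] == b[:-1]:
        return SIBLING
    return UNRELATED


def build_relativity_matrix(instances: List[Tuple[str, ...]]) -> List[List[int]]:
    """Entry [i][j] encodes the relationship between instance i and instance j."""
    return [[relation(a, b) for b in instances] for a in instances]


# Assumed metadata: the section path of each document segment.
sections = [("intro",), ("methods",), ("methods", "data"), ("methods", "model")]
matrix = build_relativity_matrix(sections)
```

In an encoder of the kind the abstract outlines, a matrix like this would presumably be passed through the relativity embedding portion so that each relationship code maps to a learned vector; that step is not shown here.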
Published/granted documents:
- US11893346B2 Transformer-based encoding incorporating metadata (publication/grant date: 2024-02-06)