- Patent title: Cross-language models based on transfer learning
- Application number: US16521129
- Filing date: 2019-07-24
- Publication number: US11048887B1
- Publication date: 2021-06-29
- Inventors: Sparsh Gupta, Igor Podgorny, Faraz Sharafi, Matthew Cannon, Vitor R. Carvalho
- Applicant: Intuit Inc.
- Applicant address: Mountain View, CA, US
- Patentee: Intuit Inc.
- Current patentee: Intuit Inc.
- Current patentee address: Mountain View, CA, US
- Agent: Ferguson Braswell Fraser Kubasta PC
- Main classification: G06F40/00
- IPC classification: G06F40/00; G06F40/58; G06F17/16; G06N3/08; G06K9/62
Abstract:
A method for text classification involves generating, using a bilingual embedding model, source language embeddings for source language documents; obtaining source language document labels of the source language documents; and training a source language classifier model and a label embedding network, executing on a computing system, using the source language embeddings and the source language document labels. The method further involves generating pseudo-labels for unlabeled target language documents, by: generating, using the bilingual embedding model, target language embeddings for the unlabeled target language documents, and applying the source language classifier model and the label embedding network to the target language embeddings to obtain the pseudo-labels for the unlabeled target language documents. In addition, the method involves training a target language classifier model executing on the computing system using the target language embeddings and the pseudo labels.
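The three stages in the abstract (train on labeled source documents, pseudo-label the target documents, train a target classifier on those pseudo-labels) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the toy `BILINGUAL_VECTORS` table stands in for a trained bilingual embedding model, a nearest-centroid classifier stands in for the classifier models, and the label embedding network is omitted entirely. All function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy bilingual embedding table: English and Spanish words with
# the same meaning share a vector. A real system would use a trained model.
BILINGUAL_VECTORS = {
    "tax": (1.0, 0.0), "impuesto": (1.0, 0.0),
    "refund": (0.9, 0.1), "reembolso": (0.9, 0.1),
    "invoice": (0.0, 1.0), "factura": (0.0, 1.0),
    "payment": (0.1, 0.9), "pago": (0.1, 0.9),
}


def embed(document):
    """Average the vectors of the known words in a document."""
    vectors = [BILINGUAL_VECTORS[w] for w in document.lower().split()
               if w in BILINGUAL_VECTORS]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def train_centroid_classifier(embeddings, labels):
    """Return one mean embedding (centroid) per label."""
    grouped = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        grouped[label].append(vec)
    return {label: tuple(sum(dim) / len(vecs) for dim in zip(*vecs))
            for label, vecs in grouped.items()}


def predict(centroids, embedding):
    """Pick the label whose centroid is closest in squared distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, embedding))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def train_target_classifier(source_docs, source_labels, target_docs):
    # 1. Embed the labeled source documents and train the source classifier.
    source_model = train_centroid_classifier(
        [embed(d) for d in source_docs], source_labels)
    # 2. Embed the unlabeled target documents and pseudo-label them.
    target_embeddings = [embed(d) for d in target_docs]
    pseudo_labels = [predict(source_model, e) for e in target_embeddings]
    # 3. Train the target classifier on the pseudo-labeled embeddings.
    target_model = train_centroid_classifier(target_embeddings, pseudo_labels)
    return target_model, pseudo_labels


# Usage with made-up documents: English is the source, Spanish the target.
target_model, pseudo_labels = train_target_classifier(
    source_docs=["tax refund", "invoice payment"],
    source_labels=["taxes", "billing"],
    target_docs=["impuesto reembolso", "factura pago", "reembolso"],
)
prediction = predict(target_model, embed("pago"))
```

Because both languages map into one shared embedding space, the classifier trained on source embeddings can score target embeddings directly. That shared space is what makes the pseudo-labeling step possible without any labeled target documents.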