- Patent title: Integration of knowledge graph embedding into topic modeling with hierarchical Dirichlet process
- Application number: US16427225
- Filing date: 2019-05-30
- Publication number: US11636355B2
- Publication date: 2023-04-25
- Inventors: Dingcheng Li, Jingyuan Zhang, Ping Li, Siamak Zamani Dadaneh
- Applicant: Baidu USA, LLC
- Applicant address: US CA Sunnyvale
- Patentee: Baidu USA, LLC
- Current patentee: Baidu USA, LLC
- Current patentee address: US CA Sunnyvale
- Agency: North Weber & Baugh LLP
- Main classification: G06F40/289
- IPC classification: G06F40/289; G06N5/04; G06N20/00; G06F40/20
Abstract:
Leveraging domain knowledge is an effective strategy for enhancing the quality of inferred low-dimensional representations of documents by topic models. Presented herein are embodiments of a Bayesian nonparametric model that employ knowledge graph (KG) embedding in the context of topic modeling for extracting more coherent topics; embodiments of the model may be referred to as topic modeling with knowledge graph embedding (TMKGE). TMKGE embodiments are hierarchical Dirichlet process (HDP)-based models that flexibly borrow information from a KG to improve the interpretability of topics. Also, embodiments of a new, efficient online variational inference method based on a stick-breaking construction of HDP were developed for TMKGE models, making TMKGE suitable for large document corpora and KGs. Experiments on datasets illustrate the superior performance of TMKGE in terms of topic coherence and document classification accuracy, compared to state-of-the-art topic modeling methods.