-
1.
Publication No.: US12014398B2
Publication date: 2024-06-18
Application No.: US18250525
Filing date: 2021-07-07
Inventors: Hongliang Fei, Jingyuan Zhang, Xingxuan Zhou, Junhao Zhao, Banghu Yin, Ping Li
IPC classification: G06Q30/0251, G06Q30/0201
CPC classification: G06Q30/0269, G06Q30/0201
Abstract: Deep neural network (DNN) models have been widely used for user-relevance content prediction. Presented herein are embodiments of a new user-relevance framework, which may be referred to as Gating-Enhanced Multi-task Neural Networks (GemNN) embodiments. Neural network-based multi-task learning model embodiments herein predict user engagement with content in a coarse-to-fine manner, which gradually reduces content candidates and allows parameter sharing from upstream tasks to downstream tasks to improve training efficiency. Also, in one or more embodiments, a gating mechanism is introduced between the embedding layers and the multi-layer perceptrons (MLPs) to learn feature interactions and control the information flow fed to the MLP layers. Tested embodiments demonstrated considerable improvements over prior approaches.
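The gating idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the gate's exact form, so the version below assumes a common choice: an element-wise sigmoid gate computed from the whole embedding and multiplied back onto it before the result is passed to an MLP. The function name `gate_embedding` and the toy weights are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_embedding(embedding, weights, bias):
    """Scale each embedding dimension by a gate value in (0, 1).

    Assumed form (not taken from the patent): gate = sigmoid(W e + b),
    output = gate * e, applied element-wise.
    """
    gated = []
    for i, row in enumerate(weights):
        # Gate i depends on every embedding dimension, so it can
        # reflect interactions between features.
        pre_activation = sum(w * e for w, e in zip(row, embedding)) + bias[i]
        gated.append(sigmoid(pre_activation) * embedding[i])
    return gated


# Toy example with random (untrained) gate weights.
random.seed(0)
dim = 4
embedding = [0.5, -1.2, 0.3, 2.0]
weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
bias = [0.0] * dim
gated = gate_embedding(embedding, weights, bias)
```

In this sketch the gate only attenuates: each output keeps the sign of its input and shrinks in magnitude, which is one way to "control the information flow" into the downstream layers.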
-
2.
Publication No.: US11615311B2
Publication date: 2023-03-28
Application No.: US16691554
Filing date: 2019-11-21
Applicant: Baidu USA, LLC
Inventors: Dingcheng Li, Jingyuan Zhang, Ping Li
Abstract: Described herein are embodiments of a unified neural network framework to integrate Topic modeling, Word embedding and Entity Embedding (TWEE) for representation learning of inputs. In one or more embodiments, a novel topic sparse autoencoder is introduced to incorporate discriminative topics into the representation learning of the input. Topic distributions of inputs are generated from a global viewpoint and are utilized to enable the autoencoder to learn topical representations. A sparsity constraint may be added to ensure that the most discriminative representations are related to topics. In addition, both word and entity-related information may be embedded into the network to help learn a more comprehensive input representation. Extensive empirical experiments show that embodiments of the TWEE framework outperform state-of-the-art methods on different datasets.
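As a rough illustration of what a sparsity constraint on an autoencoder can look like, the sketch below computes the KL-divergence penalty that is commonly used in sparse autoencoders. This is an assumption for illustration only: the abstract does not say which penalty the TWEE embodiments use, and the function name `kl_sparsity_penalty` and the target value are hypothetical.

```python
import math


def kl_sparsity_penalty(mean_activations, target=0.05):
    """Sum of KL(target || rho_hat) over hidden units.

    mean_activations: average activation of each hidden unit,
    each strictly between 0 and 1.
    target: desired average activation (small value = sparse).
    """
    penalty = 0.0
    for rho_hat in mean_activations:
        penalty += target * math.log(target / rho_hat)
        penalty += (1.0 - target) * math.log((1.0 - target) / (1.0 - rho_hat))
    return penalty


# Units that match the target contribute nothing; dense units are penalized.
on_target = kl_sparsity_penalty([0.05, 0.05])
too_dense = kl_sparsity_penalty([0.5, 0.6])
```

The penalty is added to the reconstruction loss during training, so hidden units are pushed toward being active for only a small fraction of inputs.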
-
3.
Publication No.: US11568266B2
Publication date: 2023-01-31
Application No.: US16355622
Filing date: 2019-03-15
Applicant: Baidu USA, LLC
Inventors: Dingcheng Li, Jingyuan Zhang, Ping Li
Abstract: Described herein are embodiments for systems and methods for mutual machine learning with global topic discovery and local word embedding. Both topic modeling and word embedding map documents onto a low-dimensional space, with the former clustering words into a global topic space and the latter mapping words into a local continuous embedding space. Embodiments of a Topic Modeling and Sparse Autoencoder (TMSA) framework unify these two complementary patterns by constructing a mutual learning mechanism between word co-occurrence based topic modeling and an autoencoder. In embodiments, word topics generated with topic modeling are passed into the autoencoder to impose topic sparsity, so that the autoencoder learns topic-relevant word representations. In return, word embeddings learned by the autoencoder are sent back to topic modeling to improve the quality of topic generation. Performance evaluation on various datasets demonstrates the effectiveness of the disclosed TMSA framework in discovering topics and embedding words.
-
4.
Publication No.: US11748613B2
Publication date: 2023-09-05
Application No.: US16409148
Filing date: 2019-05-10
Applicant: Baidu USA, LLC
Inventors: Dingcheng Li, Jingyuan Zhang, Ping Li
IPC classification: G06F16/93, G06F16/35, G06F40/205, G06F40/30, G06N3/08, G06N3/04, G06N3/044, G06N3/045
CPC classification: G06N3/08, G06F16/353, G06F16/93, G06F40/205, G06F40/30, G06N3/04, G06N3/044, G06N3/045
Abstract: Described herein are embodiments for a deep level-wise extreme multi-label learning and classification (XMLC) framework to facilitate the semantic indexing of literature. In one or more embodiments, the Deep Level-wise XMLC framework comprises two sequential modules: a deep level-wise multi-label learning module and a hierarchical pointer generation module. In one or more embodiments, the first module decomposes terms of a domain ontology into multiple levels and builds a special convolutional neural network for each level with category-dependent dynamic max-pooling and macro F-measure based weight tuning. In one or more embodiments, the second module merges the level-wise outputs into a final summarized semantic indexing. The effectiveness of Deep Level-wise XMLC framework embodiments is demonstrated by comparing them with several state-of-the-art methods of automatic labeling on various datasets.
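Dynamic max-pooling, mentioned in this abstract, can be sketched as follows. Instead of keeping one maximum per feature map, the map is split into several contiguous chunks and one maximum is kept per chunk. How the chunk count is tied to a category or level is not stated in the abstract, so the `num_chunks` parameter here is a hypothetical stand-in for that category-dependent choice.

```python
def dynamic_max_pool(feature_map, num_chunks):
    """Split a 1-D feature map into num_chunks contiguous pieces
    and keep the maximum of each piece."""
    n = len(feature_map)
    if not 1 <= num_chunks <= n:
        raise ValueError("num_chunks must be between 1 and len(feature_map)")
    pooled = []
    for k in range(num_chunks):
        # Integer arithmetic keeps chunks contiguous and non-empty.
        start = k * n // num_chunks
        end = (k + 1) * n // num_chunks
        pooled.append(max(feature_map[start:end]))
    return pooled


features = [0.1, 0.9, 0.4, 0.2, 0.7, 0.3]
coarse = dynamic_max_pool(features, 1)  # ordinary global max-pooling
fine = dynamic_max_pool(features, 3)    # one maximum per third of the map
```

With one chunk this reduces to ordinary max-pooling; with more chunks, the pooled vector retains some positional information about where strong features occurred.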
-
5.
Publication No.: US11727243B2
Publication date: 2023-08-15
Application No.: US16262618
Filing date: 2019-01-30
Applicant: Baidu USA, LLC
Inventors: Jingyuan Zhang, Dingcheng Li, Ping Li, Xiao Huang
IPC classification: G06N3/00, G06F16/901, G06N3/08, G06F16/2452, G06N3/04, G06N3/006, G06N3/042, G06N3/044
CPC classification: G06N3/006, G06F16/24522, G06F16/9024, G06N3/042, G06N3/044, G06N3/08
Abstract: Described herein are embodiments for question answering over a knowledge graph (KG) using a Knowledge Embedding based Question Answering (KEQA) framework. Instead of inferring an input question's head entity and predicate directly, KEQA embodiments target jointly recovering the question's head entity, predicate, and tail entity representations in the KG embedding spaces. In embodiments, a joint distance metric incorporating various loss terms is used to measure distances of a predicted fact to all candidate facts. In embodiments, the fact with the minimum distance is returned as the answer. Embodiments of a joint training strategy are also disclosed for better performance. Performance evaluation on various datasets demonstrates the effectiveness of the disclosed systems and methods using the KEQA framework.
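The answer-selection step in this abstract (return the candidate fact closest to the predicted fact) can be sketched as below. The abstract says only that the joint metric incorporates various loss terms, so this simplified version assumes a plain sum of Euclidean distances over the head, predicate, and tail embeddings. All names, vectors, and candidate facts are made up for illustration.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def joint_distance(predicted, candidate):
    """Simplified joint metric (an assumption): sum of the distances
    between predicted and candidate head, predicate, and tail vectors."""
    return sum(
        l2_distance(predicted[part], candidate[part])
        for part in ("head", "predicate", "tail")
    )


def select_answer(predicted, candidates):
    """Return the name of the candidate fact with the minimum joint distance."""
    return min(candidates, key=lambda name: joint_distance(predicted, candidates[name]))


# Toy 2-D embeddings for two hypothetical candidate facts.
candidates = {
    "fact_a": {"head": [1.0, 0.0], "predicate": [0.0, 1.0], "tail": [1.0, 1.0]},
    "fact_b": {"head": [0.0, 1.0], "predicate": [1.0, 0.0], "tail": [1.0, 1.0]},
}
predicted = {"head": [0.9, 0.1], "predicate": [0.1, 0.8], "tail": [1.0, 0.9]}
best = select_answer(predicted, candidates)
```

Here the predicted representations lie close to `fact_a` in all three parts, so `fact_a` is selected.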
-
6.
Publication No.: US11636355B2
Publication date: 2023-04-25
Application No.: US16427225
Filing date: 2019-05-30
Applicant: Baidu USA, LLC
Inventors: Dingcheng Li, Jingyuan Zhang, Ping Li, Siamak Zamani Dadaneh
IPC classification: G06F40/289, G06N5/04, G06N20/00, G06F40/20
Abstract: Leveraging domain knowledge is an effective strategy for enhancing the quality of inferred low-dimensional representations of documents by topic models. Presented herein are embodiments of a Bayesian nonparametric model that employs knowledge graph (KG) embedding in the context of topic modeling for extracting more coherent topics; embodiments of the model may be referred to as topic modeling with knowledge graph embedding (TMKGE). TMKGE embodiments are hierarchical Dirichlet process (HDP)-based models that flexibly borrow information from a KG to improve the interpretability of topics. Also, embodiments of a new, efficient online variational inference method based on a stick-breaking construction of HDP were developed for TMKGE models, making TMKGE suitable for large document corpora and KGs. Experiments on datasets illustrate the superior performance of TMKGE in terms of topic coherence and document classification accuracy, compared to state-of-the-art topic modeling methods.
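The stick-breaking construction named in this abstract can be illustrated with the basic single-level version for a Dirichlet process: repeatedly break off a Beta(1, alpha)-distributed fraction of the remaining stick, and use the broken-off lengths as mixture weights. This sketch is a simplification; an HDP applies such a construction at more than one level, and the truncation level and function name here are assumptions for illustration.

```python
import random


def stick_breaking(alpha, truncation, rng):
    """Draw truncated stick-breaking weights for a Dirichlet process.

    Returns (weights, remaining), where remaining is the stick length
    left over after `truncation` breaks.
    """
    weights = []
    remaining = 1.0
    for _ in range(truncation):
        fraction = rng.betavariate(1.0, alpha)  # share of the remaining stick
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining


rng = random.Random(0)
weights, remaining = stick_breaking(alpha=2.0, truncation=10, rng=rng)
```

The weights and the leftover stick always sum to one, and a larger `alpha` tends to spread mass over more components.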