Integration of knowledge graph embedding into topic modeling with hierarchical Dirichlet process

    Publication No.: US11636355B2

    Publication Date: 2023-04-25

    Application No.: US16427225

    Filing Date: 2019-05-30

    Applicant: Baidu USA, LLC

    Abstract: Leveraging domain knowledge is an effective strategy for enhancing the quality of inferred low-dimensional representations of documents by topic models. Presented herein are embodiments of a Bayesian nonparametric model that employ knowledge graph (KG) embedding in the context of topic modeling for extracting more coherent topics; embodiments of the model may be referred to as topic modeling with knowledge graph embedding (TMKGE). TMKGE embodiments are hierarchical Dirichlet process (HDP)-based models that flexibly borrow information from a KG to improve the interpretability of topics. Also, embodiments of a new, efficient online variational inference method based on a stick-breaking construction of HDP were developed for TMKGE models, making TMKGE suitable for large document corpora and KGs. Experiments on datasets illustrate the superior performance of TMKGE in terms of topic coherence and document classification accuracy, compared to state-of-the-art topic modeling methods.
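    The stick-breaking construction underlying the HDP-based inference can be sketched with a generic truncated stick-breaking sampler for a Dirichlet process; the truncation level and concentration parameter below are illustrative choices, not values taken from the patent:

```python
import numpy as np

def stick_breaking_weights(alpha: float, truncation: int, rng=None) -> np.ndarray:
    """Sample truncated stick-breaking weights for a Dirichlet process.

    Each weight is pi_k = beta_k * prod_{j<k} (1 - beta_j) with
    beta_k ~ Beta(1, alpha); the truncation level caps the number of
    components, as in truncated variational inference for HDP models.
    """
    rng = np.random.default_rng(rng)
    betas = rng.beta(1.0, alpha, size=truncation)
    betas[-1] = 1.0  # assign all remaining stick mass to the last component
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining

weights = stick_breaking_weights(alpha=2.0, truncation=50, rng=0)
print(len(weights), float(weights.sum()))  # 50 weights summing to ~1.0
```

    Larger `alpha` spreads the mass over more components; online variational inference optimizes variational parameters of these sticks rather than sampling them.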

    Cross-lingual language models and pretraining of cross-lingual language models

    Publication No.: US11886446B2

    Publication Date: 2024-01-30

    Application No.: US17575650

    Filing Date: 2022-01-14

    Applicant: Baidu USA, LLC

    Abstract: Existing research on cross-lingual retrieval cannot take good advantage of large-scale pretrained language models, such as multilingual BERT and XLM. The absence of cross-lingual passage-level relevance data for finetuning and the lack of query-document style pretraining are among the key factors behind this issue. Accordingly, embodiments of two novel retrieval-oriented pretraining tasks are presented herein to further pretrain cross-lingual language models for downstream retrieval tasks, such as cross-lingual ad-hoc retrieval (CLIR) and cross-lingual question answering (CLQA). In one or more embodiments, distant supervision data was constructed from multilingual texts using section alignment to support retrieval-oriented language model pretraining. In one or more embodiments, directly finetuning language models on part of an evaluation collection was performed by making Transformers capable of accepting longer sequences. Experiments show that model embodiments significantly improve upon general multilingual language models in at least the cross-lingual retrieval setting and the cross-lingual transfer setting.

    Coreference-aware representation learning for neural named entity recognition

    Publication No.: US11354506B2

    Publication Date: 2022-06-07

    Application No.: US16526614

    Filing Date: 2019-07-30

    Applicant: Baidu USA, LLC

    Abstract: Previous neural network models that perform named entity recognition (NER) typically treat the input sentences as a linear sequence of words but ignore rich structural information, such as the coreference relations among non-adjacent words, phrases, or entities. Presented herein are novel approaches to learn coreference-aware word representations for the NER task. In one or more embodiments, a "CNN-BiLSTM-CRF" neural architecture is modified to include a coreference layer component on top of the BiLSTM layer to incorporate coreferential relations. Also, in one or more embodiments, a coreference regularization is added during training to ensure that the coreferential entities share similar representations and consistent predictions within the same coreference cluster. A model embodiment achieved new state-of-the-art performance when tested.
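    The coreference-regularization idea, pulling representations of coreferential mentions toward one another, can be illustrated with a minimal centroid-based penalty; the patent's exact regularizer is not reproduced here, so the formula below is an assumption:

```python
import numpy as np

def coreference_regularizer(reps: np.ndarray, clusters: list) -> float:
    """Illustrative coreference regularization term (not the patented formula).

    For each coreference cluster, penalize the squared distance of every
    member's representation from the cluster centroid, encouraging
    coreferential mentions to share similar representations.
    """
    penalty = 0.0
    for cluster in clusters:
        members = reps[cluster]          # (m, d) representations in one cluster
        centroid = members.mean(axis=0)  # shared target for the cluster
        penalty += float(((members - centroid) ** 2).sum())
    return penalty

# Token representations for a 5-token sentence, d = 3.
reps = np.array([[1.0, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]])
# Tokens 0 and 2 corefer and already match, so they add no penalty.
print(coreference_regularizer(reps, [[0, 2]]))  # 0.0
```

    In training, a weighted version of such a term would be added to the CRF loss so that gradient updates draw cluster members together.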

    Fast neural ranking on bipartite graph indices

    Publication No.: US12056133B2

    Publication Date: 2024-08-06

    Application No.: US17555316

    Filing Date: 2021-12-17

    Applicant: Baidu USA, LLC

    IPC Classes: G06F16/2457 G06F16/901

    CPC Classes: G06F16/24578 G06F16/9024

    Abstract: Presented are systems and methods that construct BipartitE Graph INdices (BEGIN) embodiments for fast neural ranking. BEGIN embodiments comprise two types of nodes: sampled queries and base (searching) objects. In one or more embodiments, edges connecting these nodes are constructed by using a neural network ranking measure. These embodiments extend traditional search-on-graph methods and lend themselves to fast neural ranking. Experimental results demonstrate the effectiveness and efficiency of such embodiments.
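    How a bipartite index of sampled queries and base objects might be wired can be sketched as follows; `score_fn` stands in for the neural ranking measure, and the top-k edge rule is an illustrative assumption rather than the patented construction:

```python
import numpy as np

def build_bipartite_index(queries, objects, score_fn, k=2):
    """Sketch of a bipartite index in the spirit of BEGIN (details assumed).

    Nodes are sampled queries and base objects; each query node is linked
    to the k base objects the ranking measure scores highest, so a later
    graph walk can hop query -> object toward high-scoring items.
    """
    edges = {}
    for qi, q in enumerate(queries):
        scores = np.array([score_fn(q, x) for x in objects])
        edges[qi] = [int(i) for i in np.argsort(-scores)[:k]]  # top-k by score
    return edges

# A stand-in "neural" ranking measure: here simply the inner product.
score = lambda q, x: float(np.dot(q, x))
queries = np.array([[1.0, 0.0], [0.0, 1.0]])
objects = np.array([[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]])
print(build_bipartite_index(queries, objects, score, k=2))  # {0: [0, 2], 1: [1, 2]}
```

    At query time, a greedy search over such a graph evaluates the ranking network only along the traversed edges instead of against every base object.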

    Proximity graph maintenance for fast online nearest neighbor search

    Publication No.: US12050646B2

    Publication Date: 2024-07-30

    Application No.: US17408146

    Filing Date: 2021-08-20

    Applicant: Baidu USA, LLC

    IPC Classes: G06F16/901 G06F16/22

    CPC Classes: G06F16/9024 G06F16/2272

    Abstract: Incremental proximity graph maintenance (IPGM) systems and methods for online approximate nearest neighbor (ANN) search support both online deletion and insertion of vertices on proximity graphs. In various embodiments, updating a proximity graph comprises receiving a workload that represents a set of vertices in the proximity graph, each vertex being associated with a type of operation, such as a query, insertion, or deletion. For a query or an insertion, a search may be executed on the graph to obtain a set of top-K vertices for each vertex. In the case of a deletion, a vertex may be deleted from the proximity graph, and a local or global reconnection update method may be used to reconstruct at least a portion of the proximity graph.
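    The workload-processing loop described above can be sketched in toy form; the brute-force distance scan stands in for the actual graph search, and the local reconnection rule (linking a deleted vertex's former neighbors to each other) is an illustrative assumption:

```python
import numpy as np

def process_workload(graph, points, workload, K=2):
    """Toy incremental-maintenance loop in the spirit of IPGM (API assumed).

    graph: adjacency dict {vertex_id: set(neighbor_ids)};
    points: {vertex_id: vector}. Queries and insertions first search for
    the top-K nearest vertices (brute force here, standing in for a graph
    search); an insertion then links the new vertex to those neighbors,
    and a deletion removes the vertex and locally reconnects its former
    neighbors to one another.
    """
    for op, vid, vec in workload:
        if op in ("query", "insert"):
            dists = {u: float(np.linalg.norm(points[u] - vec)) for u in graph}
            topk = sorted(dists, key=dists.get)[:K]  # query result / link targets
            if op == "insert":
                points[vid] = vec
                graph[vid] = set(topk)
                for u in topk:  # keep edges bidirectional
                    graph[u].add(vid)
        else:  # "delete": local reconnection of the orphaned neighborhood
            neighbors = graph.pop(vid)
            points.pop(vid)
            for u in neighbors:
                graph[u].discard(vid)
                graph[u] |= (neighbors - {u})
    return graph

pts = {0: np.array([0.0, 0.0]), 1: np.array([1.0, 0.0]), 2: np.array([0.0, 1.0])}
g = {0: {1, 2}, 1: {0}, 2: {0}}
g = process_workload(g, pts, [("insert", 3, np.array([1.0, 1.0])),
                              ("delete", 0, None)])
print(sorted(g))  # remaining vertices: [1, 2, 3]
```

    The local reconnection keeps the neighborhood navigable without rebuilding the whole index; a global update, by contrast, would re-run searches for the affected vertices.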

    Transformation for fast inner product search on graph

    Publication No.: US11989233B2

    Publication Date: 2024-05-21

    Application No.: US17033791

    Filing Date: 2020-09-27

    Applicant: Baidu USA, LLC

    Abstract: Presented herein are embodiments of a fast search-on-graph methodology for Maximum Inner Product Search (MIPS). This optimization problem is challenging since traditional Approximate Nearest Neighbor (ANN) search methods may not perform efficiently with the nonmetric similarity measure. Embodiments herein are based on the property that a Möbius/Möbius-like transformation introduces an isomorphism between a subgraph of the ℓ2-Delaunay graph and the Delaunay graph for inner product. Based on this observation, embodiments of a novel graph indexing and searching methodology are presented to find the optimal solution with the largest inner product with the query. Experiments show significant improvements compared to existing methods.
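    The transformation at the heart of this approach maps each base point x to x / ||x||², after which a standard metric graph index built on the transformed points can serve MIPS queries. A minimal sketch, with a brute-force MIPS check on toy data for reference (the graph-index construction itself is omitted):

```python
import numpy as np

def mobius_transform(X: np.ndarray) -> np.ndarray:
    """Möbius-like transform y_i = x_i / ||x_i||^2 that links MIPS to
    metric (Euclidean) graph search; zero vectors are assumed absent."""
    norms_sq = (X ** 2).sum(axis=1, keepdims=True)
    return X / norms_sq

def mips_bruteforce(X: np.ndarray, q: np.ndarray) -> int:
    """Exact maximum inner product, for checking the idea on toy data."""
    return int(np.argmax(X @ q))

X = np.array([[0.5, 0.0], [3.0, 4.0], [0.0, 2.0]])
q = np.array([1.0, 1.0])
Y = mobius_transform(X)  # small-norm points map far out, large-norm points inward
print(mips_bruteforce(X, q))  # index of the largest inner product: 1
```

    Note how the transform inverts norms: the point [0.5, 0] maps to [2, 0], while [3, 4] maps to [0.12, 0.16], which is what lets an ℓ2-based graph over the transformed points favor large-inner-product answers.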