EXPANDING KNOWLEDGE GRAPHS USING EXTERNAL DATA SOURCE

    Publication No.: US20210012218A1

    Publication date: 2021-01-14

    Application No.: US16508038

    Filing date: 2019-07-10

    IPC classification: G06N5/02 G06F17/27 G06F16/30

    Abstract: An approach is provided that selects an original entity from an original knowledge graph and then accesses a data source that is external to that graph, such as an online encyclopedia. An entity in the data source is identified as matching the original entity. A new relation is then identified in the data source between the matched entity and a new entity that is absent from the original knowledge graph. An expanded knowledge graph is generated by adding the new entity to the original knowledge graph.
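
The expansion step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The triple representation, the `expand_graph` function, the case-insensitive name matching, and the in-memory `external_source` dictionary (standing in for an online encyclopedia) are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def entities_of(graph: Set[Triple]) -> Set[str]:
    """Collect every entity that appears as a subject or an object."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}


def expand_graph(
    graph: Set[Triple],
    original_entity: str,
    external_source: Dict[str, List[Tuple[str, str]]],
) -> Set[Triple]:
    """Return a copy of `graph` extended with entities found externally.

    `external_source` maps an entity name to (relation, target) pairs.
    Matching is by case-insensitive name equality (an assumption; a real
    system would need proper entity resolution).
    """
    known = entities_of(graph)
    expanded = set(graph)
    for name, relations in external_source.items():
        if name.casefold() != original_entity.casefold():
            continue  # not the entity we selected from the original graph
        for relation, target in relations:
            if target not in known:  # keep only entities absent from the graph
                expanded.add((original_entity, relation, target))
    return expanded


# Hypothetical data for illustration only.
graph = {("Marie Curie", "field", "Physics")}
source = {"marie curie": [("field", "Physics"), ("spouse", "Pierre Curie")]}
expanded = expand_graph(graph, "Marie Curie", source)
```

In this sketch the original graph is left unmodified and only relations that lead to a previously unknown entity are added, which mirrors the condition in the abstract that the new entity be absent from the original knowledge graph.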

    Dynamic linguistic assessment and measurement

    Publication No.: US11361031B2

    Publication date: 2022-06-14

    Application No.: US16154057

    Filing date: 2018-10-08

    Abstract: Embodiments are directed to a system, a computer program product, and a method for identifying linguistically related elements, and more specifically for predicting a linguistically related element. A linguistic algorithm forms a cluster representation of corpus entries. A linguistic term is identified and applied to the cluster representation to identify proximally related linguistic terms. Associative relationships between the proximally related terms and category metadata are iteratively investigated. One or more linguistic terms related across two or more metadata categories are identified and designated as the linguistically related element.
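
A rough Python sketch of the final selection step in this abstract follows. It swaps in plain co-occurrence within an entry as a stand-in for the cluster representation, which the abstract does not specify. The `cross_category_terms` function, the `(category, text)` entry format, and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def cross_category_terms(
    entries: List[Tuple[str, str]],
    seed: str,
    min_categories: int = 2,
) -> List[str]:
    """Find terms that accompany `seed` in at least `min_categories` categories.

    `entries` is a list of (metadata_category, text) pairs. Two terms count
    as proximally related when they occur in the same entry (an assumption
    used here in place of a learned cluster representation).
    """
    categories_by_term: Dict[str, Set[str]] = defaultdict(set)
    for category, text in entries:
        tokens = set(text.lower().split())
        if seed not in tokens:
            continue
        for token in tokens - {seed}:
            categories_by_term[token].add(category)
    return sorted(
        term
        for term, categories in categories_by_term.items()
        if len(categories) >= min_categories
    )


# Hypothetical corpus entries tagged with a metadata category.
entries = [
    ("cardiology", "aspirin reduces clotting risk"),
    ("neurology", "aspirin eases headache risk"),
    ("neurology", "migraine causes headache"),
]
related = cross_category_terms(entries, "aspirin")
```

Here only "risk" appears alongside the seed term in both categories, so it is the one term designated as related across categories.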

    GENERATING AND USING A SENTENCE MODEL FOR ANSWER GENERATION

    Publication No.: US20220075951A1

    Publication date: 2022-03-10

    Application No.: US17015663

    Filing date: 2020-09-09

    IPC classification: G06F40/30 G06F16/901

    Abstract: In an approach to generating and using a sentence model for answer generation, one or more computer processors ingest a first corpus of a plurality of text sentences. One or more computer processors convert the plurality of text sentences into a plurality of sentence vectors. One or more computer processors group the plurality of sentence vectors into a plurality of sentence clusters, wherein each sentence cluster is composed of sentences that are semantically similar. One or more computer processors receive a second corpus. One or more computer processors determine, for each sentence cluster of the plurality of sentence clusters, the frequency with which that sentence cluster appears in the second corpus. Based on the determined frequencies, one or more computer processors calculate a probability for each sentence cluster of the plurality of sentence clusters. Based on the calculated probabilities, one or more computer processors generate a first sentence model.
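
The pipeline in this abstract (vectorize, cluster, count, normalize) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the patented method: bag-of-words counts stand in for sentence vectors, a greedy cosine-similarity threshold stands in for the clustering algorithm, and all function names and the `threshold` value are hypothetical.

```python
import math
from collections import Counter
from typing import List


def vectorize(sentence: str) -> Counter:
    """Bag-of-words vector (a stand-in for a learned sentence embedding)."""
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_sentences(sentences: List[str], threshold: float = 0.5) -> List[List[Counter]]:
    """Greedy clustering: join the first cluster whose representative is close enough."""
    clusters: List[List[Counter]] = []
    for sentence in sentences:
        vector = vectorize(sentence)
        for cluster in clusters:
            if cosine(vector, cluster[0]) >= threshold:
                cluster.append(vector)
                break
        else:
            clusters.append([vector])  # start a new cluster
    return clusters


def sentence_model(clusters, second_corpus: List[str], threshold: float = 0.5) -> List[float]:
    """Probability of each cluster, from how often it appears in the second corpus."""
    if not clusters:
        return []
    counts = [0] * len(clusters)
    for sentence in second_corpus:
        vector = vectorize(sentence)
        sims = [cosine(vector, cluster[0]) for cluster in clusters]
        best = max(range(len(clusters)), key=sims.__getitem__)
        if sims[best] >= threshold:
            counts[best] += 1
    total = sum(counts)
    return [count / total if total else 0.0 for count in counts]


# Hypothetical corpora for illustration only.
first_corpus = ["the cat sat", "the cat sat down", "stocks fell sharply"]
second_corpus = ["the cat sat quietly", "the cat sat up", "stocks fell sharply today"]
clusters = cluster_sentences(first_corpus)
probabilities = sentence_model(clusters, second_corpus)
```

With this data the first corpus yields two clusters, and the second corpus assigns two of its three sentences to the first cluster, so the resulting probabilities are 2/3 and 1/3.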

    Weighting and Expanding Query Terms Based on Language Model Favoring Surprising Words

    Publication No.: US20180232374A1

    Publication date: 2018-08-16

    Application No.: US15619689

    Filing date: 2017-06-12

    IPC classification: G06F17/30

    Abstract: An approach is provided that receives a question at a question answering (QA) system. The question includes a plurality of words. The approach calculates weights that correspond to search terms drawn from the question; the search terms include the individual words and may also include terms that are sequences of adjacent words in the question. Based on the calculated weights and the words in the question, the approach generates a query that is used to search a corpus managed by the QA system, with the search yielding one or more search results.
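
One way to picture the weighting described in this abstract is the Python sketch below. The abstract does not give the weighting formula, so the sketch assumes a smoothed unigram language model and uses surprisal (negative log probability) as the weight, which favors rare words in the spirit of the title. The function names, the add-one smoothing, and the background text are all hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Dict


def build_unigram_model(background_text: str) -> Callable[[str], float]:
    """Return a word-probability function with add-one smoothing."""
    counts = Counter(background_text.lower().split())
    total = sum(counts.values())
    vocabulary = len(counts)

    def prob(word: str) -> float:
        # The extra +1 in the denominator reserves mass for unseen words.
        return (counts[word] + 1) / (total + vocabulary + 1)

    return prob


def weight_search_terms(
    question: str, prob: Callable[[str], float], max_ngram: int = 2
) -> Dict[str, float]:
    """Weight single words and sequences of adjacent words by surprisal.

    A sequence's weight is the sum of its words' surprisals, which treats
    the words as independent (a simplifying assumption).
    """
    words = question.lower().split()
    weights: Dict[str, float] = {}
    for n in range(1, max_ngram + 1):
        for i in range(len(words) - n + 1):
            gram = words[i : i + n]
            weights[" ".join(gram)] = sum(-math.log(prob(w)) for w in gram)
    return weights


# Hypothetical background text and question for illustration only.
background = "the of the a the is of a the capital is the city of"
prob = build_unigram_model(background)
weights = weight_search_terms("what is the capital of burkina faso", prob)
ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
```

The ranked list could then be turned into a weighted query in whatever syntax the search engine accepts. Because longer sequences sum more surprisals, a practical system would likely normalize by length or weight sequences separately.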

    Weighting and Expanding Query Terms Based on Language Model Favoring Surprising Words

    Publication No.: US20180232373A1

    Publication date: 2018-08-16

    Application No.: US15430597

    Filing date: 2017-02-13

    IPC classification: G06F17/30

    Abstract: An approach is provided that receives a question at a question answering (QA) system. The question includes a plurality of words. The approach calculates weights that correspond to search terms drawn from the question; the search terms include the individual words and may also include terms that are sequences of adjacent words in the question. Based on the calculated weights and the words in the question, the approach generates a query that is used to search a corpus managed by the QA system, with the search yielding one or more search results.