Deep learning based automatic ontology extraction to detect new domain knowledge

    Publication number: US11868715B2

    Publication date: 2024-01-09

    Application number: US17140360

    Application date: 2021-01-04

    CPC classification number: G06F40/20 G06F17/18 G06N3/047

    Abstract: A system processes unstructured data to identify a plurality of subsets of text in a set of text in the unstructured data and determines, for a subset from the plurality of subsets, probabilities based on a position of the subset in the set of text, a part of speech (POS) of each word in the subset, and POSs of one or more words on left and right hand sides of the subset, a number of the one or more words being selected based on a length of the set of text. The system generates a feature vector for the subset, the feature vector including the probabilities and additional features of the subset; and classifies, using a classifier, the subset into one of a plurality of classes based on the feature vector for the subset, the plurality of classes representing an ontology of a domain of knowledge.
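
    The feature-vector step in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the patented method: the window rule in `context_window`, the Laplace-smoothed count tables in `CountModel`, the class labels `part` and `symptom`, and the product-of-probabilities rule that stands in for the classifier.

```python
from collections import Counter, defaultdict
from math import prod


def context_window(n_tokens: int) -> int:
    """Hypothetical rule: look at more neighbouring words in longer texts."""
    return 1 if n_tokens < 10 else 2


def span_features(pos_tags, start, end):
    """Describe the span pos_tags[start:end] by its position and POS context."""
    n = len(pos_tags)
    k = context_window(n)
    return {
        "position": start / n,  # relative position of the span in the text
        "span_pos": tuple(pos_tags[start:end]),
        "left_pos": tuple(pos_tags[max(0, start - k):start]),
        "right_pos": tuple(pos_tags[end:end + k]),
    }


class CountModel:
    """Laplace-smoothed estimates of P(class | feature value) from labeled spans."""

    FEATURES = ("span_pos", "left_pos", "right_pos")

    def __init__(self, classes):
        self.classes = list(classes)
        self.counts = defaultdict(Counter)  # (feature, value) -> class counts

    def fit(self, examples):
        for feats, label in examples:
            for name in self.FEATURES:
                self.counts[(name, feats[name])][label] += 1

    def prob(self, name, value, label):
        c = self.counts[(name, value)]
        return (c[label] + 1) / (sum(c.values()) + len(self.classes))

    def feature_vector(self, feats):
        """Position followed by one probability per (feature, class) pair."""
        vec = [feats["position"]]
        for name in self.FEATURES:
            vec.extend(self.prob(name, feats[name], lab) for lab in self.classes)
        return vec

    def classify(self, feats):
        """Stand-in classifier: the class with the largest probability product."""
        return max(
            self.classes,
            key=lambda lab: prod(self.prob(n, feats[n], lab) for n in self.FEATURES),
        )


# Toy training text tagged as: DT NN VBZ DT JJ NN
train_tags = ["DT", "NN", "VBZ", "DT", "JJ", "NN"]
model = CountModel(["part", "symptom"])
model.fit([
    (span_features(train_tags, 1, 2), "part"),
    (span_features(train_tags, 2, 3), "symptom"),
])

query = span_features(["DT", "NN", "VBZ"], 1, 2)
vector = model.feature_vector(query)
label = model.classify(query)
```

    A real system would add the "additional features" the abstract mentions and feed the vector to a trained classifier; the count tables above only show how position and POS context can be turned into per-class probabilities.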

    Methodology for generating a consistent semantic model by filtering and fusing multi-source ontologies

    Publication number: US10678834B2

    Publication date: 2020-06-09

    Application number: US15422540

    Application date: 2017-02-02

    Abstract: A system for filtering and fusing multi-source ontologies includes a tangible processing controller unit and a non-transitory computer-readable storage device in communication with the tangible processing controller unit. The storage device includes a first receiving unit that, when executed by the tangible processing controller unit, receives a plurality of ontologies, each ontology having a set of rules and a class structure with a plurality of data classes. The storage device also includes a second receiving unit that, when executed, receives data. The device also includes a comparison unit that compares the data classes from the plurality of ontologies, and a merging unit that merges the data classes that are identical or consistent into a new data class. The storage device also includes a discarding unit that discards the data classes that are inconsistent. The storage device also includes a new-set-generation unit that generates a new set of class structure.
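
    The compare, merge, and discard steps in this abstract can be sketched as follows. The representation is an assumption made for illustration: each ontology is a dict mapping a class name to its attributes and their types, two definitions count as "consistent" when no shared attribute has conflicting types, and classes that appear in only one ontology are kept. The function name `fuse` and the sample classes are hypothetical.

```python
def fuse(ontologies):
    """Merge class definitions by name and drop classes whose definitions conflict.

    Returns (merged, discarded): the fused class structure and the names of
    the classes that were discarded as inconsistent.
    """
    merged, discarded = {}, set()
    for onto in ontologies:
        for name, attrs in onto.items():
            if name in discarded:
                continue  # already known to be inconsistent
            if name not in merged:
                merged[name] = dict(attrs)
                continue
            current = merged[name]
            # Inconsistent: a shared attribute is declared with different types.
            conflict = any(k in current and current[k] != v for k, v in attrs.items())
            if conflict:
                del merged[name]
                discarded.add(name)
            else:
                current.update(attrs)  # identical or consistent: take the union
    return merged, discarded


onto_a = {"Vehicle": {"vin": "str"}, "Part": {"weight": "float"}}
onto_b = {"Vehicle": {"vin": "str", "model": "str"}, "Part": {"weight": "str"}}

merged, discarded = fuse([onto_a, onto_b])
# "Vehicle" is merged into one class; "Part" is discarded (float vs. str).
```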

    METHODOLOGY FOR GENERATING A CONSISTENT SEMANTIC MODEL BY FILTERING AND FUSING MULTI-SOURCE ONTOLOGIES

    Publication number: US20180218071A1

    Publication date: 2018-08-02

    Application number: US15422540

    Application date: 2017-02-02

    CPC classification number: G06F16/367 G06N5/022 G06N7/005 G06N20/00

    Abstract: A system for filtering and fusing multi-source ontologies includes a tangible processing controller unit and a non-transitory computer-readable storage device in communication with the tangible processing controller unit. The storage device includes a first receiving unit that, when executed by the tangible processing controller unit, receives a plurality of ontologies, each ontology having a set of rules and a class structure with a plurality of data classes. The storage device also includes a second receiving unit that, when executed, receives data. The device also includes a comparison unit that compares the data classes from the plurality of ontologies, and a merging unit that merges the data classes that are identical or consistent into a new data class. The storage device also includes a discarding unit that discards the data classes that are inconsistent. The storage device also includes a new-set-generation unit that generates a new set of class structure.

    Semantic similarity analysis to determine relatedness of heterogeneous data

    Publication number: US10482178B2

    Publication date: 2019-11-19

    Application number: US15672643

    Application date: 2017-08-09

    Abstract: A method and system to determine relatedness select a first customer observable from a first source document, the first customer observable being made up of two terms, the two terms being a first term of a first type and a first term of a second type, and select a second customer observable from a second source document, the second customer observable being made up of a second term of the first type and a second term of the second type. The method includes creating a first corpus of all documents that include the first terms, creating a second corpus of all documents that include the second terms, obtaining other first terms in the first corpus and other second terms in the second corpus, and performing semantic similarity analysis to determine a similarity score between the first customer observable and the second customer observable.
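
    The corpus-building and scoring steps in this abstract can be sketched under stated assumptions. The abstract does not name a similarity measure, so this sketch substitutes cosine similarity over the counts of co-occurring words; it also assumes that a document belongs to an observable's corpus only when it contains both of its terms. The function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def corpus_for(terms, documents):
    """All documents that contain every term of the observable."""
    return [d for d in documents if all(t in d.lower().split() for t in terms)]


def cooccurring(terms, corpus):
    """Counts of the other words that appear alongside the observable's terms."""
    counts = Counter()
    for doc in corpus:
        counts.update(w for w in doc.lower().split() if w not in terms)
    return counts


def cosine(a, b):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity(obs1, obs2, documents):
    """Similarity score between two observables via their co-occurring words."""
    return cosine(
        cooccurring(obs1, corpus_for(obs1, documents)),
        cooccurring(obs2, corpus_for(obs2, documents)),
    )


documents = [
    "brake squeal when stopping",
    "brake squeal at low speed",
    "rotor grinding when stopping",
    "radio static at night",
]
brake = ("brake", "squeal")      # (part, symptom) pairs, used here as
rotor = ("rotor", "grinding")    # illustrative customer observables
radio = ("radio", "static")

score_related = similarity(brake, rotor, documents)
score_unrelated = similarity(brake, radio, documents)
```

    In this toy corpus the two braking observables share more context words than the braking and radio observables, so `score_related` comes out higher than `score_unrelated`.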

    SEMANTIC SIMILARITY ANALYSIS TO DETERMINE RELATEDNESS OF HETEROGENEOUS DATA

    Publication number: US20190050394A1

    Publication date: 2019-02-14

    Application number: US15672643

    Application date: 2017-08-09

    CPC classification number: G06F17/2785 G06F16/285 G06F16/93

    Abstract: A method and system to determine relatedness select a first customer observable from a first source document, the first customer observable being made up of two terms, the two terms being a first term of a first type and a first term of a second type, and select a second customer observable from a second source document, the second customer observable being made up of a second term of the first type and a second term of the second type. The method includes creating a first corpus of all documents that include the first terms, creating a second corpus of all documents that include the second terms, obtaining other first terms in the first corpus and other second terms in the second corpus, and performing semantic similarity analysis to determine a similarity score between the first customer observable and the second customer observable.
