Deep learning based automatic ontology extraction to detect new domain knowledge

    Publication No.: US11868715B2

    Publication date: 2024-01-09

    Application No.: US17140360

    Filing date: 2021-01-04

    CPC classification number: G06F40/20 G06F17/18 G06N3/047

    Abstract: A system processes unstructured data to identify a plurality of subsets of text within a set of text in the unstructured data. For a subset from the plurality of subsets, the system determines probabilities based on a position of the subset in the set of text, a part of speech (POS) of each word in the subset, and the POSs of one or more words on the left-hand and right-hand sides of the subset, the number of those words being selected based on a length of the set of text. The system generates a feature vector for the subset that includes the probabilities and additional features of the subset, and uses a classifier to classify the subset into one of a plurality of classes based on that feature vector, the plurality of classes representing an ontology of a domain of knowledge.
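The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the probabilities are estimated, which additional features are used, or what kind of classifier is applied. Everything named here is therefore a hypothetical stand-in: the window rule in `context_window`, the lookup tables `pos_prob` and `ctx_prob`, the relative-position score, the span-length feature, the class names, and the nearest-centroid classifier.

```python
from typing import Dict, List, Sequence, Tuple

Token = Tuple[str, str]  # (word, POS tag)


def context_window(n_tokens: int) -> int:
    """Hypothetical rule: look at more neighbouring words in longer texts."""
    return 1 if n_tokens < 10 else 2


def candidate_spans(n_tokens: int, max_len: int = 3) -> List[Tuple[int, int]]:
    """All contiguous (start, end) spans of up to max_len tokens, end exclusive."""
    return [(s, e)
            for s in range(n_tokens)
            for e in range(s + 1, min(s + max_len, n_tokens) + 1)]


def feature_vector(tokens: Sequence[Token], span: Tuple[int, int],
                   pos_prob: Dict[str, float],
                   ctx_prob: Dict[str, float]) -> List[float]:
    """Build [position score, inner POS score, context POS score, span length].

    pos_prob and ctx_prob are assumed lookup tables mapping a POS tag to a
    probability; a real system would have to estimate them from data.
    """
    start, end = span
    n = len(tokens)
    k = context_window(n)              # window size depends on text length
    tags = [tag for _, tag in tokens]

    position = start / n               # relative position of the span
    inner = sum(pos_prob.get(t, 0.0) for t in tags[start:end]) / (end - start)

    # POS tags of up to k words on the left and right of the span
    ctx = tags[max(0, start - k):start] + tags[end:end + k]
    context = sum(ctx_prob.get(t, 0.0) for t in ctx) / len(ctx) if ctx else 0.0

    return [position, inner, context, float(end - start)]


def classify(vec: Sequence[float],
             centroids: Dict[str, Sequence[float]]) -> str:
    """Stand-in classifier: pick the class whose centroid is nearest."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda name: sq_dist(vec, centroids[name]))


# Toy input: a three-word text with made-up tags and probabilities.
tokens = [("neural", "ADJ"), ("network", "NOUN"), ("learns", "VERB")]
pos_prob = {"ADJ": 0.5, "NOUN": 1.0}
ctx_prob = {"VERB": 0.5}
centroids = {"concept": [0.0, 0.8, 0.5, 2.0], "other": [0.5, 0.1, 0.1, 1.0]}

spans = candidate_spans(len(tokens))
vec = feature_vector(tokens, (0, 2), pos_prob, ctx_prob)
label = classify(vec, centroids)
```

In this toy run the span `(0, 2)` covers "neural network" and falls nearest the hypothetical `concept` centroid. A trained model, such as the deep learning classifier suggested by the title, would take the place of `classify`.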
