System and Method Performing Terminology Disambiguation

    Publication number: US20240176778A1

    Publication date: 2024-05-30

    Application number: US18059237

    Application date: 2022-11-28

    Applicant: SAP SE

    CPC classification number: G06F16/243 G06F40/242 G06F40/30

    Abstract: Term ambiguity is resolved by referencing a terminology database. An input is received comprising a term designated as ambiguous and a string including the term. The term is posed as a query to the terminology database, which contains metadata of at least one type. Query results are returned including at least two possible meanings. One or more sequences are extracted from the query results, each sequence including at least two pieces of metadata of a same type, one for each possible meaning of the ambiguous term. The metadata of each entry of a sequence is compared with the query result and corresponding scores are calculated. The scores are compared to determine a final meaning of the ambiguous term. Simpler embodiments considering one type of metadata (one sequence) may calculate and compare a listing of scores. Complex embodiments considering more than one type of metadata (multiple sequences) may calculate and compare a matrix of scores.
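
The scoring step in this abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how metadata is compared or how scores are combined, so the sketch assumes a Jaccard token overlap against the input string and a simple column sum over the score matrix. The names `tokenize`, `overlap_score`, and `disambiguate`, and the metadata field names, are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of punctuation-free tokens."""
    stripped = (word.strip(".,;:()").lower() for word in text.split())
    return {word for word in stripped if word}


def overlap_score(metadata: str, context: str) -> float:
    """Jaccard overlap between a metadata value and the context string.

    Assumption: the abstract does not name a comparison measure.
    """
    a, b = tokenize(metadata), tokenize(context)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def disambiguate(
    context: str,
    candidates: List[Dict[str, str]],
    metadata_types: List[str],
) -> Tuple[str, List[List[float]]]:
    """Pick the candidate meaning whose metadata best matches the context.

    Each row of the matrix is one "sequence" (one metadata type); each
    column is one possible meaning. With a single metadata type the
    matrix degenerates to a single list of scores.
    """
    matrix = [
        [overlap_score(c.get(mtype, ""), context) for c in candidates]
        for mtype in metadata_types
    ]
    # Assumption: scores are combined by summing each column.
    totals = [sum(column) for column in zip(*matrix)]
    best = max(range(len(candidates)), key=lambda i: totals[i])
    return candidates[best]["meaning"], matrix
```

For example, with two hypothetical terminology entries for "bank" (a finance entry and a geography entry) and the string "She opened a deposits account at the bank for loans in finance", the finance entry scores higher on both metadata types and is returned as the final meaning.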

    Semantic Domain Assignment Referencing Governance Domains and Term Databases

    Publication number: US20240354511A1

    Publication date: 2024-10-24

    Application number: US18304640

    Application date: 2023-04-21

    Applicant: SAP SE

    CPC classification number: G06F40/30 G06F40/58

    Abstract: Embodiments relate to systems and methods that improve the definition of semantic domains within incoming data and accurately distribute data over those defined domains. In a particular embodiment, company-specific terminology and data governance (d.g.) domains are used to define “highly semantically loaded” terms within an incoming linguistic data corpus having existing semantic domains assigned thereto. Analyzing distribution patterns of such highly semantically loaded terms across the incoming linguistic data (and/or across the d.g. domains) enhances the accuracy of assignment of semantic domains and distribution of the data across these domains. Such improved semantic domains can improve operation of computers tasked with downstream processing of the linguistic data, e.g., by Natural Language Processing (NLP).

    Text verticalization categorization

    Publication number: US11494568B1

    Publication date: 2022-11-08

    Application number: US17229987

    Application date: 2021-04-14

    Applicant: SAP SE

    Abstract: Systems and methods include acquisition of a plurality of text segments, each of the text segments associated with a flag value indicating whether the text segment is associated with a correct replacement text or an incorrect replacement text; determination of one or more n-grams of each text segment of the plurality of text segments; generation, based on the one or more n-grams of each text segment and the flag value associated with each text segment, of a model to determine a flag value based on one or more input n-grams; reception of an input text segment; determination of a second one or more n-grams of the input text segment; determination, using the model, of an output flag value based on the determined second one or more n-grams; and presentation of the input text segment and the output flag value on a display.
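
The pipeline in this abstract (n-gram extraction, model generation from flagged segments, prediction of a flag for a new segment) can be sketched as follows. The abstract does not name a model type, so this sketch assumes a simple add-one-smoothed log-odds score over unigram and bigram counts; `ngrams`, `train`, and `predict` are hypothetical names, and a flag of `True` stands for "correct replacement text".

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def ngrams(text: str, n_max: int = 2) -> List[str]:
    """Return all word n-grams of the text for n = 1..n_max."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def train(segments: List[Tuple[str, bool]]) -> Dict[bool, Counter]:
    """Count n-grams separately for each flag value."""
    counts: Dict[bool, Counter] = {True: Counter(), False: Counter()}
    for text, flag in segments:
        counts[flag].update(ngrams(text))
    return counts


def predict(model: Dict[bool, Counter], text: str) -> bool:
    """Predict a flag from the summed log-odds of the input's n-grams.

    Assumption: add-one smoothing and a zero decision threshold; the
    abstract leaves the model unspecified.
    """
    vocab_size = len(set(model[True]) | set(model[False]))
    total_true = sum(model[True].values()) + vocab_size
    total_false = sum(model[False].values()) + vocab_size
    score = 0.0
    for gram in ngrams(text):
        p_true = (model[True][gram] + 1) / total_true
        p_false = (model[False][gram] + 1) / total_false
        score += math.log(p_true) - math.log(p_false)
    return score >= 0
```

A caller would train on flagged segments, call `predict` on an incoming segment, and then show the segment together with the predicted flag, which corresponds to the presentation step in the abstract.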
