-
Publication number: US20240176778A1
Publication date: 2024-05-30
Application number: US18059237
Application date: 2022-11-28
Applicant: SAP SE
Inventor: Tetyana Chernenko, Benjamin Schork, Marcus Danei
IPC: G06F16/242, G06F40/242, G06F40/30
CPC classification number: G06F16/243, G06F40/242, G06F40/30
Abstract: Term ambiguity is resolved by referencing a terminology database. An input is received comprising the term designated as ambiguous and a string including the term. The term is posed as a query to the terminology database, which contains metadata of at least one type. Query results are returned including at least two possible meanings. One or more sequences are extracted from the query results, each sequence including at least two pieces of metadata of the same type, one for each possible meaning of the ambiguous term. The metadata of each entry of a sequence is compared with the query result and corresponding scores are calculated. The scores are compared to determine a final meaning of the ambiguous term. Simpler embodiments that consider one type of metadata (one sequence) may calculate and compare a list of scores. More complex embodiments that consider more than one type of metadata (multiple sequences) may calculate and compare a matrix of scores.
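The scoring flow described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how scores are calculated or combined, so the token-overlap (Jaccard) score, the summing of scores per meaning, the comparison against the input string, and all names (`TERMBASE`, `overlap_score`, `disambiguate`) are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical terminology database: term -> candidate meanings with metadata.
TERMBASE: Dict[str, List[Dict[str, str]]] = {
    "bank": [
        {"meaning": "financial institution",
         "definition": "institution that holds money and customer account deposits",
         "domain": "finance"},
        {"meaning": "river bank",
         "definition": "sloping land beside a river or stream",
         "domain": "geography"},
    ]
}


def tokenize(text: str) -> set:
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return {w for w in words if w}


def overlap_score(metadata: str, context: str) -> float:
    """Jaccard overlap of tokens (an assumed scoring function)."""
    a, b = tokenize(metadata), tokenize(context)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def disambiguate(term: str, context: str,
                 metadata_types: List[str]) -> Tuple[str, List[List[float]]]:
    """Return the best-scoring meaning and the score matrix.

    Each row of the matrix is one sequence (one metadata type);
    each column is one candidate meaning of the ambiguous term.
    """
    candidates = TERMBASE[term]
    if len(candidates) < 2:
        raise ValueError("term is not ambiguous in the terminology database")
    matrix = [[overlap_score(c.get(t, ""), context) for c in candidates]
              for t in metadata_types]
    # Assumption: column sums decide the final meaning.
    totals = [sum(row[j] for row in matrix) for j in range(len(candidates))]
    best = max(range(len(candidates)), key=totals.__getitem__)
    return candidates[best]["meaning"], matrix


meaning, scores = disambiguate(
    "bank",
    "The customer opened an account at the bank to deposit money",
    ["definition", "domain"],
)
```

Passing a single metadata type yields a one-row matrix, which corresponds to the list-of-scores case in the abstract; passing several types yields the matrix case.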
-
Publication number: US20240354511A1
Publication date: 2024-10-24
Application number: US18304640
Application date: 2023-04-21
Applicant: SAP SE
Inventor: Tetyana Chernenko, Benjamin Schork, Marcus Danei
Abstract: Embodiments relate to systems and methods that improve the definition of semantic domains within incoming data, and accurately distribute data over those defined domains. In a particular embodiment, company-specific terminology and data governance (d.g.) domains are used to define “highly semantically loaded” terms within an incoming linguistic data corpus having existing semantic domains assigned thereto. Analyzing distribution patterns of such highly semantically loaded terms across the incoming linguistic data (and/or across the d.g. domains) enhances the accuracy of the assignment of semantic domains and of the distribution of the data across these domains. Such improved semantic domains can improve operation of computers tasked with downstream processing of the linguistic data, e.g., by Natural Language Processing (NLP).
-
Publication number: US11494568B1
Publication date: 2022-11-08
Application number: US17229987
Application date: 2021-04-14
Applicant: SAP SE
Inventor: Lauritz Brandt, Marcus Danei, Benjamin Schork
IPC: G06F40/58, G06F40/279, G06F40/166
Abstract: Systems and methods include acquisition of a plurality of text segments, each of the text segments associated with a flag value indicating whether the text segment is associated with a correct replacement text or an incorrect replacement text; determination of one or more n-grams of each text segment of the plurality of text segments; generation, based on the one or more n-grams of each text segment and the flag value associated with each text segment, of a model to determine a flag value based on one or more input n-grams; reception of an input text segment; determination of a second one or more n-grams of the input text segment; determination, using the model, of an output flag value based on the determined second one or more n-grams; and presentation of the input text segment and the output flag value on a display.
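The train-then-predict flow in this abstract can be sketched as follows. The abstract does not name the model type, so this sketch swaps in a small naive Bayes classifier with add-one smoothing over word unigrams and bigrams; the function names, the example segments, and the choice of word-level n-grams are all assumptions for illustration, not the patented implementation.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def ngrams(text: str, n_max: int = 2) -> List[str]:
    """Return word n-grams of length 1..n_max for a text segment."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        grams.extend(" ".join(tokens[i:i + n])
                     for i in range(len(tokens) - n + 1))
    return grams


def train(segments: List[Tuple[str, bool]]) -> Dict:
    """Count n-grams per flag value (True = correct replacement)."""
    counts = {True: Counter(), False: Counter()}
    docs = Counter()
    for text, flag in segments:
        counts[flag].update(ngrams(text))
        docs[flag] += 1
    vocab = set(counts[True]) | set(counts[False])
    return {"counts": counts, "docs": docs, "vocab": vocab}


def predict(model: Dict, text: str) -> bool:
    """Return the flag value with the higher smoothed log-probability."""
    total_docs = sum(model["docs"].values())
    vocab_size = len(model["vocab"])
    scores = {}
    for flag in (True, False):
        total = sum(model["counts"][flag].values())
        # Smoothed class prior, then add-one smoothed n-gram likelihoods.
        score = math.log((model["docs"][flag] + 1) / (total_docs + 2))
        for gram in ngrams(text):
            score += math.log((model["counts"][flag][gram] + 1)
                              / (total + vocab_size))
        scores[flag] = score
    return scores[True] >= scores[False]


# Hypothetical labeled segments, invented for this example.
TRAINING = [
    ("colour changed to color", True),
    ("centre changed to center", True),
    ("save changed to delete", False),
    ("open changed to close", False),
]
model = train(TRAINING)
flag = predict(model, "colour changed to color")
```

In a real system the predicted flag would be shown next to the input segment, as the abstract describes; the display step is omitted here.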
-