-
Publication No.: US20230359825A1
Publication date: 2023-11-09
Application No.: US17738898
Filing date: 2022-05-06
Applicant: SAP SE
Inventor: Hans-Martin Ramsl
IPC: G06F40/295 , G06N3/08 , G06N3/04 , G06N5/02 , G06F40/253
CPC classification number: G06F40/295 , G06N3/08 , G06N3/0427 , G06N5/022 , G06F40/253
Abstract: Example methods and systems are directed to generating knowledge graph entities from text. Natural language text is received as input and processed using named entity recognition (NER), part of speech (POS) recognition, and business object recognition (BOR). The outputs of the NER, POS, and BOR processes are combined to generate knowledge entity triples comprising two entities and a relationship between them. Keywords are extracted from the text using NER to generate a set of entities. A node in a knowledge graph is created for at least some of the entities. A POS tagger identifies verbs in the text, generating a set of verbs. Relational verbs (e.g., “talk to” or “communicated with”) are detected and used to create edges in the knowledge graph. The knowledge graph may be converted back to natural language text using a trained machine learning model.
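A minimal sketch of how entity and verb tags could be combined into triples and graph edges, as the abstract describes. It assumes the tokens are already tagged; the NER and POS models themselves are not shown, business object recognition is left out, and the names `extract_triples`, `build_graph`, and `RELATIONAL_VERBS` are hypothetical, not taken from the patent.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical pre-tagged token: (token, POS tag, entity label or None).
# Real tags would come from NER and POS models; here they are hard-coded.
Tagged = Tuple[str, str, Optional[str]]
Triple = Tuple[str, str, str]

# Assumed list of relational verbs; the patent does not specify one.
RELATIONAL_VERBS = {"talked", "communicated"}


def extract_triples(tagged: List[Tagged]) -> List[Triple]:
    """Pair each relational verb with the nearest entity on either side."""
    triples = []
    for i, (token, pos, _) in enumerate(tagged):
        if pos != "VERB" or token.lower() not in RELATIONAL_VERBS:
            continue
        subject = next((t for t, _, ent in reversed(tagged[:i]) if ent), None)
        obj = next((t for t, _, ent in tagged[i + 1:] if ent), None)
        if subject and obj:
            triples.append((subject, token, obj))
    return triples


def build_graph(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Create a node per entity and a labeled edge per triple."""
    graph: Dict[str, List[Tuple[str, str]]] = {}
    for subject, verb, obj in triples:
        graph.setdefault(subject, []).append((verb, obj))
        graph.setdefault(obj, [])
    return graph


sample = [
    ("Alice", "PROPN", "PERSON"),
    ("talked", "VERB", None),
    ("to", "ADP", None),
    ("Bob", "PROPN", "PERSON"),
]
triples = extract_triples(sample)
graph = build_graph(triples)
```

Here the nearest-entity rule is only a stand-in for whatever combination logic the claimed system uses.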
-
Publication No.: US20230139644A1
Publication date: 2023-05-04
Application No.: US17513188
Filing date: 2021-10-28
Applicant: SAP SE
Inventor: Hans-Martin Ramsl
IPC: G06F40/30 , G06F16/36 , G06F16/33 , G06N20/00 , G06F40/247
Abstract: Systems, methods, and computer-readable media are disclosed for list attribute normalization and standardization for creation of a controlled vocabulary. A vocabulary set comprising a plurality of vocabulary terms may be received. For each vocabulary term, semantic duplicates may be identified. The semantic duplicates may be identified by analyzing semantics, syntactics, or phonetics of the vocabulary terms. Semantic chains may be formed from each vocabulary term and the corresponding semantic duplicates. The terms in each semantic chain may be ranked to determine a most probable vocabulary term. The most probable vocabulary term may then replace the semantic chain. The most probable vocabulary terms across all semantic chains from the vocabulary set may form the controlled vocabulary.
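A rough sketch of the chain-and-rank idea, under two loud assumptions: plain string similarity (`difflib.SequenceMatcher`) stands in for the semantic, syntactic, or phonetic analysis, and frequency in the input list stands in for the ranking. The function names and the 0.8 threshold are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Assumed duplicate test: case-insensitive string similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def build_chains(terms):
    """Greedily group each unique term with the first chain it resembles."""
    chains = []
    for term in dict.fromkeys(terms):  # unique terms, original order
        for chain in chains:
            if any(similar(term, member) for member in chain):
                chain.append(term)
                break
        else:
            chains.append([term])
    return chains


def controlled_vocabulary(terms):
    """Replace each chain with its most frequent member."""
    counts = Counter(terms)
    return [max(chain, key=lambda t: counts[t]) for chain in build_chains(terms)]


terms = ["color", "colour", "color", "Color", "size", "sizes", "size"]
chains = build_chains(terms)
vocabulary = controlled_vocabulary(terms)
```

With this input, the spelling variants collapse into two chains and the most frequent spelling of each survives.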
-
Publication No.: US20230133030A1
Publication date: 2023-05-04
Application No.: US17516948
Filing date: 2021-11-02
Applicant: SAP SE
Inventor: Ran M. Bittmann , Hans-Martin Ramsl
IPC: G06K9/62
Abstract: Systems, methods, and computer-readable media are disclosed for visual labeling of training data items for training a machine learning model. Training data items may be generated for training the machine learning model. Visual labels, such as QR codes, may be created for the training data items. The creation of the training data item and the visual label may be automated. The visual labels and the training data items may be combined to obtain a labeled training data item. The labeled training data item may comprise a separator to distinguish the training data item from the visual label. The labeled training data item may be used for training and validation of the machine learning model. The machine learning model may analyze the training data item, attempt to identify the training data item, and compare the identification against the embedded label.
-
Publication No.: US11620127B2
Publication date: 2023-04-04
Application No.: US17317340
Filing date: 2021-05-11
Applicant: SAP SE
Inventor: Hans-Martin Ramsl , Priyanshu Shukla
Abstract: Source code is analyzed to identify components. The components are each assigned a complexity score. Documentation for the source code is identified, related to the components, and given a score based on the quantity of the documentation for the component and the complexity score for the component. To determine semantic meaning of the documentation, vector embeddings for the documentation languages may be generated and aligned. Alignment causes the different machine learning models to generate similar vectors for semantically similar words in the different languages. Since the vectors of the words of the other languages are similar to the vectors of the words in a primary language with similar meanings, the vector representation of the documentation in the other languages will match the vector representation of the source code when the documentation is substantially on the same topic.
-
Publication No.: US20230096118A1
Publication date: 2023-03-30
Application No.: US17486554
Filing date: 2021-09-27
Applicant: SAP SE
Inventor: Hans-Martin Ramsl
Abstract: Datasets are available from different dataset servers and often lack well-defined metadata. Thus, comparing datasets is difficult. Additionally, there might be different versions of the same dataset, which makes the search even more difficult. Using systems and methods described herein, quality scores, dataset versioning, topic identification, and semantic relatedness metadata are stored about datasets stored on dataset servers. A user interface is provided to allow a user to search for datasets by specifying search criteria (e.g., a topic and a minimum quality score) and to be informed of responsive datasets. The user interface may further inform the user of the quality scores of the responsive datasets, the versions of the responsive datasets, or other metadata. From the search results, the user may select and download one or more of the responsive datasets.
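As an illustration of the search criteria mentioned in the abstract (a topic and a minimum quality score), the following sketch filters an in-memory catalog. The `DatasetRecord` fields, the 0 to 1 quality scale, and the sample entries are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetRecord:
    # Hypothetical metadata fields; the patent does not fix a schema.
    name: str
    version: str
    topic: str
    quality: float  # assumed scale: 0.0 (worst) to 1.0 (best)


def search_datasets(catalog: List[DatasetRecord], topic: str,
                    min_quality: float = 0.0) -> List[DatasetRecord]:
    """Return datasets on the topic that meet the quality floor, best first."""
    hits = [d for d in catalog if d.topic == topic and d.quality >= min_quality]
    return sorted(hits, key=lambda d: d.quality, reverse=True)


catalog = [
    DatasetRecord("weather-eu", "1.0", "climate", 0.6),
    DatasetRecord("weather-eu", "2.0", "climate", 0.9),
    DatasetRecord("sales-2020", "1.0", "retail", 0.8),
]
results = search_datasets(catalog, topic="climate", min_quality=0.7)
```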
-
Publication No.: US12242808B2
Publication date: 2025-03-04
Application No.: US17738898
Filing date: 2022-05-06
Applicant: SAP SE
Inventor: Hans-Martin Ramsl
IPC: G06F40/295 , G06F40/253 , G06N3/042 , G06N3/08 , G06N5/022
Abstract: Example methods and systems are directed to generating knowledge graph entities from text. Natural language text is received as input and processed using named entity recognition (NER), part of speech (POS) recognition, and business object recognition (BOR). The outputs of the NER, POS, and BOR processes are combined to generate knowledge entity triples comprising two entities and a relationship between them. Keywords are extracted from the text using NER to generate a set of entities. A node in a knowledge graph is created for at least some of the entities. A POS tagger identifies verbs in the text, generating a set of verbs. Relational verbs (e.g., “talk to” or “communicated with”) are detected and used to create edges in the knowledge graph. The knowledge graph may be converted back to natural language text using a trained machine learning model.
-
Publication No.: US20240012936A1
Publication date: 2024-01-11
Application No.: US17862091
Filing date: 2022-07-11
Applicant: SAP SE
Inventor: Hans-Martin Ramsl
CPC classification number: G06F21/6254 , G06F21/6209 , G06T7/11 , G06T11/60 , G06T2207/20021
Abstract: An input image is divided into segments. The segments may be reassembled to reform the input image. The order of the segments may be stored in an encrypted database for which approved applications have the decryption key but users do not. This allows the approved applications to determine the order and reform the input image without allowing users to do the same. To further increase the difficulty of reforming the input image, the segments may be transformed. Example transformations include rotation and mirroring. The encrypted database may store an indication of the transformation applied to each segment. The effort of reforming the input image without access to the database is increased substantially. The reformed input image may be stored in transient memory only, without being stored to long-term storage. Thus, the reformed image cannot be accessed from a file system by unauthorized users.
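The segment-and-reorder scheme can be sketched on a small pixel grid. This is an assumption-heavy toy: segments are horizontal bands, mirroring is the only transformation, and the `key` list that records order and transformations is kept in plain memory, whereas the abstract stores it in an encrypted database. The function names are hypothetical.

```python
import random


def scramble(image, band_height, seed=None):
    """Split a 2D grid into horizontal bands, shuffle them, mirror some."""
    rng = random.Random(seed)
    bands = [image[i:i + band_height] for i in range(0, len(image), band_height)]
    order = list(range(len(bands)))
    rng.shuffle(order)
    scrambled, key = [], []
    for idx in order:
        mirrored = rng.random() < 0.5
        band = [row[::-1] for row in bands[idx]] if mirrored else bands[idx]
        scrambled.append(band)
        key.append((idx, mirrored))  # original position and transformation
    return scrambled, key


def restore(scrambled, key):
    """Undo the mirroring and put each band back in its original position."""
    bands = [None] * len(key)
    for band, (idx, mirrored) in zip(scrambled, key):
        bands[idx] = [row[::-1] for row in band] if mirrored else band
    return [row for band in bands for row in band]


image = [[r * 4 + c for c in range(4)] for r in range(6)]
scrambled, key = scramble(image, band_height=2, seed=1)
restored = restore(scrambled, key)
```

Without `key`, a reader of `scrambled` has to guess both the permutation and the per-band transformation, which is the added effort the abstract refers to.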
-
Publication No.: US20230040412A1
Publication date: 2023-02-09
Application No.: US17395213
Filing date: 2021-08-05
Applicant: SAP SE
Inventor: Hans-Martin Ramsl
Abstract: A machine learning model is trained to translate source code from one or more programming languages into a common programming language. The machine learning model translates source code from the other languages into the common programming language. A language embedder generates a vector for each function in the source code, all of which is now in the common programming language. A user provides a text search query which is converted by a language embedder to a vector. Based on the vector of the text search query and the vectors for the source code, search results are generated and presented in a user interface. Additional machine learning models may be trained and used to measure function complexity, test coverage, documentation quantity and complexity, or any suitable combination thereof. These measures may be used to determine which search results to present, an order in which to present search results, or both.
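To make the vector-based search step concrete, here is a small sketch in which token counts stand in for the learned language embedder and cosine similarity ranks the functions. The translation into a common programming language is skipped, and `embed`, `cosine`, and `search` are hypothetical names, not the patent's.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Assumed stand-in embedder: a bag of lowercase alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def search(query: str, functions: dict, top_k: int = 3) -> list:
    """Rank function names by similarity of their source to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(src)), name) for name, src in functions.items()]
    return [name for score, name in sorted(scored, reverse=True)[:top_k]
            if score > 0]


functions = {
    "read_file": "def read_file(path): return open(path).read()",
    "add": "def add(a, b): return a + b",
}
hits = search("read file path", functions)
```

The complexity, test coverage, and documentation measures from the abstract could be folded into the sort key to reorder or filter `hits`; that part is not shown.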
-
Publication No.: US20220067364A1
Publication date: 2022-03-03
Application No.: US17009526
Filing date: 2020-09-01
Applicant: SAP SE
Inventor: Hans-Martin Ramsl
Abstract: In an example embodiment, machine learning is used to intelligently compress documents to reduce the overall footprint of storing large amounts of files for an organization. Specifically, a document is split into parts, with each part representing a grouping of text or an image. Optical character recognition is performed to identify the text in images. Machine learning techniques are then applied to a part of a document in order to determine how relevant the document is for the organization. The parts that are deemed to be not relevant may then be reduced in size, either by omitting them completely or by summarizing them. This allows for the compression to be tailored specifically to the organization, resulting in the ability to compress or eliminate parts of documents that other organizations might have found relevant (and thus would not have been compressed or eliminated through traditional means).
-
Publication No.: US11157780B2
Publication date: 2021-10-26
Application No.: US16054242
Filing date: 2018-08-03
Applicant: SAP SE
Inventor: Vincenzo Turco , Annika Berger , Hans-Martin Ramsl
IPC: G06F11/07 , G06F9/50 , G06K9/62 , G06F16/28 , G06F16/22 , G06N20/00 , G06F9/48 , G06F16/26 , G06F16/2453 , G06F16/21
Abstract: A system includes a model repository comprising a plurality of models respectively being adapted to perform, when used by an analytical program, a computational task, in which a first database table is created in the database, the first database table having a predefined table structure that corresponds to the analytical program, a best-model of the plurality of models is stored in the first database table, and a request of a client device to perform the computational task and comprising input data is received. If the received request does not comprise a model-ID, the analytical program reads the model currently stored in the first table and uses the read model for performing the computational task on the input data. If the received request comprises a model-ID, the analytical program creates a second database table having the predefined table structure in the database, reads a model associated with the model-ID from the model repository, stores the read model associated with the model-ID in the second table, and uses the model read from the second table for performing the computational task on the input data.
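The request handling described in the abstract can be sketched as follows, with a dictionary standing in for the database tables and plain callables standing in for models. The class name, table names, and sample models are assumptions for illustration only.

```python
class AnalyticalProgram:
    """Toy dispatcher: default best model unless a model-ID is supplied."""

    def __init__(self, repository, best_model_id):
        self.repository = repository  # model-ID -> callable model
        # The "first table" holds the current best model.
        self.tables = {"best_model": repository[best_model_id]}

    def handle(self, input_data, model_id=None):
        if model_id is None:
            # No model-ID in the request: read the model from the first table.
            model = self.tables["best_model"]
        else:
            # Model-ID given: create a second table, store the requested
            # model there, then read it back for use.
            table_name = f"model_{model_id}"
            self.tables[table_name] = self.repository[model_id]
            model = self.tables[table_name]
        return model(input_data)


repository = {"double": lambda x: 2 * x, "square": lambda x: x * x}
program = AnalyticalProgram(repository, best_model_id="double")
default_result = program.handle(3)
explicit_result = program.handle(3, model_id="square")
```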