METHOD FOR A SOFTWARE DEVELOPMENT SYSTEM
    Patent application

    Publication number: US20200183681A1

    Publication date: 2020-06-11

    Application number: US16657118

    Filing date: 2019-10-18

    Applicant: SAP SE

    Abstract: The present disclosure relates to a method for a software development system, the software development system comprising a code repository storing source code. The method comprises: receiving at the code repository an additional code; receiving at one or more documentation repositories documentation information for documenting the source code; generating corpus-based semantic word embeddings for code and documentation words of the source code and the documentation information; using the word embeddings for mapping by the software development system the source code to corresponding documentation; storing the mapping of the source code to the corresponding documentation.
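
The mapping step in this abstract can be illustrated with a minimal sketch. It assumes a tiny hand-written embedding table (`EMBEDDINGS`), averaged word vectors, and cosine similarity as the matching rule; none of these specifics come from the patent, which only states that corpus-based word embeddings are used to map source code to documentation.

```python
import math
import re

# Hypothetical toy embeddings; a real system would learn them from a corpus
# of code words and documentation words.
EMBEDDINGS = {
    "parse": [0.9, 0.1, 0.0],
    "json": [0.8, 0.2, 0.1],
    "read": [0.7, 0.3, 0.0],
    "send": [0.1, 0.9, 0.1],
    "email": [0.0, 0.8, 0.2],
    "message": [0.1, 0.9, 0.0],
}


def tokenize(text):
    """Split snake_case, camelCase, and prose into lowercase words."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", spaced.lower())


def embed(text):
    """Average the vectors of all known words; None if no word is known."""
    vectors = [EMBEDDINGS[w] for w in tokenize(text) if w in EMBEDDINGS]
    if not vectors:
        return None
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def map_code_to_docs(code_units, docs):
    """Map each code unit name to the ID of its most similar document."""
    mapping = {}
    for name in code_units:
        code_vec = embed(name)
        best, best_score = None, 0.0
        for doc_id, text in docs.items():
            doc_vec = embed(text)
            if code_vec is None or doc_vec is None:
                continue
            score = cosine(code_vec, doc_vec)
            if score > best_score:
                best, best_score = doc_id, score
        mapping[name] = best
    return mapping


docs = {
    "doc-1": "Read and parse a JSON file.",
    "doc-2": "Send an email message.",
}
# The stored mapping is represented here by a plain dictionary.
mapping = map_code_to_docs(["parse_json", "sendEmail"], docs)
```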

    MODEL-BASED ANALYSIS IN A RELATIONAL DATABASE

    Publication number: US20190073570A1

    Publication date: 2019-03-07

    Application number: US16054242

    Filing date: 2018-08-03

    Applicant: SAP SE

    Abstract: A system includes a model repository comprising a plurality of models respectively being adapted to perform, when used by an analytical program, a computational task, in which a first database table is created in the database, the first database table having a predefined table structure that corresponds to the analytical program, a best-model of the plurality of models is stored in the first database table, and a request of a client device to perform the computational task and comprising input data is received. If the received request does not comprise a model-ID, the analytical program reads the model currently stored in the first table and uses the read model for performing the computational task on the input data. If the received request comprises a model-ID, the analytical program creates a second database table having the predefined table structure in the database, reads a model associated with the model-ID from the model repository, stores the read model associated with the model-ID in the second table, and uses the model read from the second table for performing the computational task on the input data.
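
The two request paths described in this abstract can be sketched with an in-memory SQLite database. The table names, the `MODEL_REPOSITORY` dictionary, and the idea of a model being a single slope value are all assumptions made for illustration; the patent does not specify them.

```python
import sqlite3

# Hypothetical model repository: model-ID -> serialized model (here, a slope).
MODEL_REPOSITORY = {"m1": "2.0", "m2": "3.0"}

# Predefined table structure shared by every model table.
TABLE_STRUCTURE = "(model_id TEXT, payload TEXT)"


def create_model_table(conn, table, model_id):
    """Create a table with the predefined structure and store one model in it.

    Table names are fixed constants in this sketch, so formatting them into
    the SQL text is safe here; values still use bound parameters.
    """
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} {TABLE_STRUCTURE}")
    conn.execute(f"DELETE FROM {table}")
    conn.execute(
        f"INSERT INTO {table} VALUES (?, ?)",
        (model_id, MODEL_REPOSITORY[model_id]),
    )


def read_model(conn, table):
    (payload,) = conn.execute(f"SELECT payload FROM {table}").fetchone()
    return float(payload)


def handle_request(conn, input_data, model_id=None):
    """Run the computational task with the best model or a requested one."""
    if model_id is None:
        table = "best_model"       # first table, already holds the best model
    else:
        table = "requested_model"  # second table, filled from the repository
        create_model_table(conn, table, model_id)
    slope = read_model(conn, table)
    return [slope * x for x in input_data]


conn = sqlite3.connect(":memory:")
create_model_table(conn, "best_model", "m1")
default_result = handle_request(conn, [1, 2])
explicit_result = handle_request(conn, [1, 2], model_id="m2")
```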

    Smart dataset collection system
    Granted patent

    Publication number: US11874798B2

    Publication date: 2024-01-16

    Application number: US17486554

    Filing date: 2021-09-27

    Applicant: SAP SE

    CPC classification number: G06F16/164 G06F16/1873 G06F16/345 G06F40/40

    Abstract: Datasets are available from different dataset servers and often lack well-defined metadata. Thus, comparing datasets is difficult. Additionally, there might be different versions of the same dataset which makes the search even more difficult. Using systems and methods described herein, quality scores, dataset versioning, topic identification, and semantic relatedness metadata is stored about datasets stored on dataset servers. A user interface is provided to allow a user to search for datasets by specifying search criteria (e.g., a topic and a minimum quality score) and to be informed of responsive datasets. The user interface may further inform the user of the quality scores of the responsive datasets, the versions of the responsive datasets, or other metadata. From the search results, the user may select and download one or more of the responsive datasets.
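
The search described here (a topic plus a minimum quality score) might look roughly like the following. The `DatasetRecord` fields, the sample catalog, and the ranking by quality are hypothetical; the abstract does not say how scores are computed or how results are ordered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Metadata stored about one dataset version (hypothetical fields)."""
    name: str
    version: str
    topics: frozenset
    quality: float


# Invented sample catalog for illustration only.
CATALOG = [
    DatasetRecord("air-quality", "v2", frozenset({"climate", "health"}), 0.91),
    DatasetRecord("air-quality", "v1", frozenset({"climate", "health"}), 0.74),
    DatasetRecord("retail-sales", "v3", frozenset({"economy"}), 0.88),
]


def search(catalog, topic, min_quality):
    """Return responsive datasets, best quality first."""
    hits = [d for d in catalog if topic in d.topics and d.quality >= min_quality]
    return sorted(hits, key=lambda d: d.quality, reverse=True)


results = search(CATALOG, topic="climate", min_quality=0.8)
```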

    OCR using 3-dimensional interpolation

    Publication number: US11837000B1

    Publication date: 2023-12-05

    Application number: US17746451

    Filing date: 2022-05-17

    Applicant: SAP SE

    CPC classification number: G06V30/16 G06V30/1801

    Abstract: To perform 3-dimensional interpolation, a 3-dimensional model of an input text character is generated. For example, a 2-dimensional character may be given depth using an extrusion transformation. The 3-dimensional model of the input text character is compared to 3-dimensional models of candidate characters and the results of the 3-dimensional comparisons are used to select the optical character recognition (OCR) output for the input text character. The 3-dimensional comparison may be performed directly on the 3-dimensional models. Alternatively, a set of 2-dimensional images may be generated for each 3-dimensional model and 2-dimensional comparisons performed. By use of the additional information gathered from the comparisons of the 3-dimensional models, the correct OCR output character can be identified with greater confidence. As a result, the quality of the OCR output is improved, improving the functioning of a computer performing OCR tasks and reducing the expenditure of time and processing power in correcting OCR errors.
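
The extrusion and direct 3-dimensional comparison can be sketched as follows. The 3x3 bitmaps, the voxel-set representation, and the intersection-over-union score are assumptions chosen to keep the example small. With a plain extrusion the voxel overlap carries the same information as the 2-dimensional overlap, so this only shows the data flow, not the accuracy gain the abstract claims.

```python
def extrude(bitmap, depth=3):
    """Give a 2-D character bitmap depth: return its filled (x, y, z) voxels."""
    return {
        (x, y, z)
        for y, row in enumerate(bitmap)
        for x, ch in enumerate(row)
        if ch == "#"
        for z in range(depth)
    }


def voxel_similarity(a, b):
    """Intersection over union of two voxel sets (an assumed metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical candidate characters as tiny bitmaps.
CANDIDATES = {
    "I": [".#.", ".#.", ".#."],
    "L": ["#..", "#..", "###"],
    "T": ["###", ".#.", ".#."],
}


def recognize(bitmap):
    """Pick the candidate whose 3-D model best matches the input's model."""
    model = extrude(bitmap)
    scores = {
        ch: voxel_similarity(model, extrude(template))
        for ch, template in CANDIDATES.items()
    }
    return max(scores, key=scores.get)


noisy_l = ["#..", "#..", "##."]  # an "L" with one missing pixel
result = recognize(noisy_l)
```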

    AUDIO FILE ANNOTATION
    Patent application

    Publication number: US20230094828A1

    Publication date: 2023-03-30

    Application number: US17486661

    Filing date: 2021-09-27

    Applicant: SAP SE

    Abstract: Text-to-speech translation is used to generate a transcript for an audio file. Text segments are associated with time segments in the transcript. A trained machine learning model determines, based on the text in the transcript, one or more topics for the audio file. The transcript is modified to include the determined one or more topics. A user interface may be presented that allows a user to search for portions of an audio file that relate to a particular topic. In response to the selected or entered topic, the user interface presents segments having a matching topic. The user may use voice or other user interface commands to modify the annotation of the audio file. User commands may also be used to extract data from the transcript and copy the data to a clipboard or to another application.
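
A rough sketch of the annotated transcript and the topic search follows. A keyword lookup (`TOPIC_KEYWORDS`) stands in for the trained machine learning model, and the `Segment` structure is hypothetical; the transcript text is invented sample data.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One transcript segment: a time range, its text, and assigned topics."""
    start: float
    end: float
    text: str
    topics: set = field(default_factory=set)


# Stand-in for the trained topic model: topic -> trigger words (assumption).
TOPIC_KEYWORDS = {
    "budget": {"budget", "cost"},
    "hiring": {"hire", "candidate"},
}


def annotate(segments):
    """Attach every topic whose keywords occur in a segment's text."""
    for seg in segments:
        words = set(seg.text.lower().replace(".", "").replace(",", "").split())
        seg.topics = {t for t, kws in TOPIC_KEYWORDS.items() if words & kws}
    return segments


def find_by_topic(segments, topic):
    """Return the time ranges of all segments tagged with the topic."""
    return [(s.start, s.end) for s in segments if topic in s.topics]


transcript = annotate([
    Segment(0.0, 4.5, "The budget covers travel cost."),
    Segment(4.5, 9.0, "We should hire one more candidate."),
    Segment(9.0, 12.0, "Thanks, everyone."),
])
hiring_ranges = find_by_topic(transcript, "hiring")
```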

    SMART DOCUMENT MANAGEMENT
    Patent application

    Publication number: US20230062307A1

    Publication date: 2023-03-02

    Application number: US17404428

    Filing date: 2021-08-17

    Applicant: SAP SE

    Abstract: Files are automatically named based on their contents and metadata. Contents include words in a text file, text recognized using optical character recognition (OCR) in an image file, and objects recognized using object recognition in an image file. Metadata includes creation date, modification date, user owning the file, file type, and file extension. Multiple files may be processed. A file sorter may determine an order in which to process the multiple files. For example, smaller files may be processed first. In addition to using the words discussed above to name the file, the file may be tagged based on the contents of the file. A search function for files may search both names and tags to identify responsive files. Two or more files may be linked based on their contents or metadata.
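
As an illustration of content-based naming and the file sorter, the sketch below names a text file from its most frequent words plus its creation date, and orders files so smaller ones are processed first. The stop-word list, the name pattern, and the function names are assumptions, not details from the patent.

```python
import re
from collections import Counter
from datetime import date

# Small illustrative stop-word list (assumption).
STOPWORDS = {"a", "an", "and", "for", "is", "of", "the", "to"}


def suggest_name(text, created, extension, max_words=2):
    """Build a file name from the most frequent content words and the date."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    top = [w for w, _ in Counter(words).most_common(max_words)]
    return "_".join(top + [created.isoformat()]) + "." + extension


def processing_order(files):
    """File sorter: smaller files first. `files` maps a path to its size in bytes."""
    return sorted(files, key=files.get)


name = suggest_name(
    "Invoice for invoice number 42: payment due. Invoice payment.",
    created=date(2021, 8, 17),
    extension="txt",
)
order = processing_order(
    {"scan.png": 900_000, "note.txt": 1_200, "report.txt": 48_000}
)
```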

    MEASURING DOCUMENTATION COMPLETENESS IN MULTIPLE LANGUAGES

    Publication number: US20220365776A1

    Publication date: 2022-11-17

    Application number: US17317340

    Filing date: 2021-05-11

    Applicant: SAP SE

    Abstract: Source code is analyzed to identify components. The components are each assigned a complexity score. Documentation for the source code is identified, related to the components, and given a score based on the quantity of the documentation for the component and the complexity score for the component. To determine semantic meaning of the documentation, vector embeddings for the documentation languages may be generated and aligned. Alignment causes the different machine learning models to generate similar vectors for semantically similar words in the different languages. Since the vectors of the words of the other languages are similar to the vectors of the words in a primary language with similar meanings, the vector representation of the documentation in the other languages will match the vector representation of the source code when the documentation is substantially on the same topic.
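
The scoring idea (documentation quantity weighed against component complexity) can be sketched for Python source as below. The complexity measure (one plus the number of branching constructs), the `words_per_point` ratio, and the cap at 1.0 are all assumptions; the multilingual embedding alignment described in the abstract is not shown.

```python
import ast

SOURCE = '''
def simple(x):
    """Return x plus one, nothing more."""
    return x + 1

def tangled(items):
    """Sum positives."""
    total = 0
    for i in items:
        if i > 0:
            total += i
    return total
'''


def complexity(func_node):
    """Assumed complexity score: 1 plus the number of branching constructs."""
    branches = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp)
    return 1 + sum(isinstance(n, branches) for n in ast.walk(func_node))


def completeness(source, words_per_point=5):
    """Score each function's docstring length against its complexity."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            doc_words = len((ast.get_docstring(node) or "").split())
            needed = words_per_point * complexity(node)
            scores[node.name] = min(1.0, doc_words / needed)
    return scores


scores = completeness(SOURCE)
```

In this sketch a short docstring is enough for a trivial function, while the same amount of text scores low on a function with more branches.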

    Identifying attributes in unstructured data files using a machine-learning model

    Publication number: US11461680B2

    Publication date: 2022-10-04

    Application number: US16880696

    Filing date: 2020-05-21

    Applicant: SAP SE

    Abstract: Provided herein are a system, apparatus, device, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for identifying attributes in unstructured data files using a machine-learning model. In an embodiment, a server can receive a request to identify an attribute associated with a set of unstructured data files. The server can extract a first and second subset of features from each unstructured data file of the set of unstructured data files. The server can identify the attribute in the set of unstructured data files request based on each of the first and second subset of features using the machine-learning model.

    Domain similarity scores for information retrieval

    Publication number: US10380163B2

    Publication date: 2019-08-13

    Application number: US15379442

    Filing date: 2016-12-14

    Applicant: SAP SE

    Abstract: Various embodiments of systems, computer program products, and methods to provide domain similarity scores for information retrieval are described herein. In an aspect, a plurality of files associated with a plurality of domains are retrieved. A corpus corresponding to the plurality of domains is generated based on the plurality of files by integrating the plurality of files corresponding to the plurality of domains. Further, similarity scores between the plurality of domains are determined based on the generated corpus. The similarity scores between the plurality of domains are persisted in a similarity scores repository to enable information retrieval during translating data between different languages.
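
One way to picture the pipeline in this abstract: merge files into a per-domain corpus, then score each pair of domains. The sketch below uses term counts and cosine similarity, with a dictionary standing in for the similarity scores repository. The metric, the sample files, and the domain names are assumptions; the patent does not say how the scores are computed.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Invented (domain, file text) pairs for illustration.
FILES = [
    ("finance", "invoice payment tax invoice"),
    ("legal", "contract tax clause payment"),
    ("sports", "match goal team"),
]


def build_corpus(files):
    """Integrate all files of a domain into one term-count vector."""
    corpus = {}
    for domain, text in files:
        terms = re.findall(r"[a-z]+", text.lower())
        corpus.setdefault(domain, Counter()).update(terms)
    return corpus


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_scores(corpus):
    """Score every unordered pair of domains, keyed in sorted order."""
    return {
        (d1, d2): cosine(corpus[d1], corpus[d2])
        for d1, d2 in combinations(sorted(corpus), 2)
    }


# The dictionary plays the role of the similarity scores repository.
repository = similarity_scores(build_corpus(FILES))
```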

    MACHINE LEARNING DRIVEN DATA MANAGEMENT
    Patent application

    Publication number: US20190244094A1

    Publication date: 2019-08-08

    Application number: US15890184

    Filing date: 2018-02-06

    Applicant: SAP SE

    Abstract: A system for machine learning driven data management is provided. In some implementations, the system performs operations including receiving, by a neural network, first and second textual data associated with a first item and a second item. The operations further include converting, by the neural network, the first and second textual data to a first vector and a second vector. The operations further include determining, by the neural network, whether the first item and the second item satisfy, based on a comparison of the first vector with the second vector, a similarity threshold. The operations further include selecting, by the neural network and in response to satisfaction of the similarity threshold, one of the first item and the second item, the selecting based on a selection criteria. The operations further include providing, by the neural network, a recommendation on a user interface regarding the selected first item or second item.
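
The compare-then-select flow can be sketched as below. A word-count vector stands in for the neural network's text encoder, and "most recently updated" is a hypothetical selection criterion; the threshold value and the `Item` fields are likewise assumptions.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    description: str
    last_updated: int  # hypothetical selection criterion: newest wins


def to_vector(text):
    """Stand-in for the neural network encoder: a bag-of-words vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(first, second, threshold=0.8):
    """Return the item to keep if the two are similar enough, else None."""
    score = cosine(to_vector(first.description), to_vector(second.description))
    if score < threshold:
        return None
    return max(first, second, key=lambda item: item.last_updated)


a = Item("A1", "steel hex bolt m8", last_updated=2017)
b = Item("B7", "hex bolt m8 steel", last_updated=2018)
c = Item("C3", "rubber garden hose", last_updated=2018)

duplicate_pick = recommend(a, b)
unrelated_pick = recommend(a, c)
```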
