Patent search ap:("SAP SE") AND inv:"Sohyeong Kim" Page 1

1.

Granted invention patent
Image search and training system (Status: in force)

Publication No.: US10885385B2

Publication date: 2021-01-05

Application No.: US16357720

Filing date: 2019-03-19

Applicant: SAP SE

Inventor: Sohyeong Kim, Eduardo Vellasques, Urko Sanchez Sanz

IPC: G06K9/00, G06K9/62, G06F16/532, G06F16/56, G06N3/08

Abstract: Disclosed herein are system, method, and computer program product embodiments for providing an image search training system. An embodiment operates by determining a query image on which to train an image search system, and a positive image visually similar to the query image. A set of negative images from a plurality of negative images visually dissimilar to the query image is selected, where the selected set of negative images includes both a first negative image and a second negative image. A first similarity measure between the first negative image and the positive image, and a second similarity measure between the second negative image and the positive image are calculated. The first negative image is selected based on the first similarity measure being less than the second similarity measure. The query image, the positive image, and the first negative image are provided to the image search system for training.
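The negative-selection step in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the similarity measure is computed, so cosine similarity over embedding vectors is an assumption here, and the names `cosine_similarity` and `select_negative` are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors (assumed similarity measure)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_negative(positive, negatives):
    """Return the index of the negative whose similarity measure to the
    positive image is lowest, together with all computed measures."""
    scores = [cosine_similarity(neg, positive) for neg in negatives]
    return int(np.argmin(scores)), scores

# Toy 2-D embeddings standing in for image features (hypothetical values).
query = np.array([1.0, 0.0])
positive = np.array([0.9, 0.1])
negatives = [np.array([0.0, 1.0]), np.array([0.5, 0.5])]

idx, scores = select_negative(positive, negatives)
# The training triplet handed to the image search system.
triplet = (query, positive, negatives[idx])
```

With these toy vectors the first negative has the smaller measure, so it is the one placed in the triplet.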

2.

Published invention application
TRAINING OF MACHINE LEARNING MODELS USING CONTENT MASKING TECHNIQUES (Status: under examination, published)

Publication No.: US20240338957A1

Publication date: 2024-10-10

Application No.: US18130955

Filing date: 2023-04-05

Applicant: SAP SE

Inventor: Sohyeong Kim

IPC: G06V30/19, G06N3/045, G06N3/09, G06V10/82, G06V30/14

CPC classification number: G06V30/19147, G06N3/045, G06N3/09, G06V10/82, G06V30/1448, G06V30/19007

Abstract: A method for training a machine learning model is provided. The method comprises extracting texts and locations of the texts from a document, generating embeddings for the document, a first set of the embeddings characterizing a first subset of the texts and locations of the first subset of the texts and a second set of the embeddings characterizing a second subset of the texts that are masked and additional locations of the second subset of the texts that are masked, generating additional embeddings characterizing contents of the second subset of the texts, generating relevance values based on a comparison, identifying, for each of the additional locations of the second subset of the texts that are masked, a respective content of the second subset of the texts having a relevance value that is higher than the remaining relevance values, and outputting each of the respective contents.
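The identification step (pick, for each masked location, the content with the highest relevance value) might look like the following Python sketch. The abstract only says the relevance values come from "a comparison", so the dot product used here is an assumption, and `identify_masked_contents` and the toy embeddings are hypothetical.

```python
import numpy as np

def identify_masked_contents(masked_location_embeddings, content_embeddings, contents):
    """For each masked location, return the candidate content whose relevance
    value (assumed here to be a dot product) is highest."""
    # relevance[i, j] compares masked location i with candidate content j.
    relevance = masked_location_embeddings @ content_embeddings.T
    best = relevance.argmax(axis=1)
    return [contents[j] for j in best], relevance

# Two masked locations and two candidate contents (hypothetical values).
masked_location_embeddings = np.array([[1.0, 0.0], [0.0, 1.0]])
content_embeddings = np.array([[0.2, 0.9], [0.8, 0.1]])
contents = ["Invoice", "Total"]

predicted, relevance = identify_masked_contents(
    masked_location_embeddings, content_embeddings, contents
)
```

In a training setup the relevance matrix would typically feed a loss that rewards the correct pairing; that part is not described in the abstract and is omitted here.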

3.

Invention application
VIRTUAL ADVERSARIAL TRAINING FOR DOCUMENT INFORMATION EXTRACTION MODELS (Status: in force)

Publication No.: US20250111691A1

Publication date: 2025-04-03

Application No.: US18477770

Filing date: 2023-09-29

Applicant: SAP SE

Inventor: Christoph Batke, Sohyeong Kim

IPC: G06V30/413, G06V30/19

Abstract: The present disclosure relates to computer-implemented methods, software, and systems for extracting information from documents based on training techniques to generate a document foundation model that is used to initialize a document information extraction model that is fine-tuned to business document specifics. A document information extraction model is initialized based on weights provided from a first pretrained model. Fine-tuning of the document information extraction model is performed based on labeled business documents as second training data. The labeled business documents are labeled and evaluated according to a virtual adversarial training (VAT). Based on the performed fine-tuning, a classifier for classification of information extraction is generated.
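For readers unfamiliar with virtual adversarial training, the sketch below shows the generic VAT loss for a toy linear softmax classifier in Python. It is not the method claimed in the application: the abstract does not describe how VAT is applied to business documents, and the model, the function `vat_loss`, and the parameters `epsilon` and `xi` are illustrative assumptions. The gradient is written out analytically for this toy model; a real extraction model would rely on automatic differentiation.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax of a 1-D logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with positive entries."""
    return float(np.sum(p * (np.log(p) - np.log(q))))

def vat_loss(W, b, x, epsilon=0.5, xi=1e-3, rng=None):
    """Generic VAT loss for logits = W @ x + b, using one power-iteration step."""
    rng = np.random.default_rng(0) if rng is None else rng
    p = softmax(W @ x + b)                 # prediction on the clean input
    d = rng.normal(size=x.shape)           # random starting direction
    d /= np.linalg.norm(d)
    q = softmax(W @ (x + xi * d) + b)      # prediction on a slightly perturbed input
    grad = W.T @ (q - p)                   # gradient of KL(p || q) w.r.t. the perturbation
    r_adv = epsilon * grad / (np.linalg.norm(grad) + 1e-12)
    q_adv = softmax(W @ (x + r_adv) + b)   # prediction under the adversarial perturbation
    return kl_divergence(p, q_adv), r_adv

# Toy 3-class model on 2-D inputs (hypothetical values).
W = np.array([[1.0, -1.0], [-0.5, 0.5], [0.2, 0.3]])
b = np.zeros(3)
x = np.array([0.3, -0.2])

loss, r_adv = vat_loss(W, b, x)
```

The returned loss is usually added to the supervised loss as a regularizer, which encourages predictions to stay stable under small input perturbations.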

4.

Invention application
QUERY IMAGE SYNTHESIS FOR IMAGE RETRIEVAL SYSTEMS BASED ON DATA AUGMENTATION TECHNIQUES (Status: in force)

Publication No.: US20220019849A1

Publication date: 2022-01-20

Application No.: US16932130

Filing date: 2020-07-17

Applicant: SAP SE

Inventor: Sohyeong Kim, Ying Jiang, Cordula Guder

IPC: G06K9/62, G06K9/00, G06N20/00, G06T7/90, G06F16/535

Abstract: Methods, systems, and articles of manufacture, including computer program products, are provided for synthesizing images for machine learning. The method may include selecting one or more image preprocessing transformations to apply on the foreground object image; applying the selected one or more image preprocessing transformations to the foreground object image; selecting a background image from a set of background images depicting a variety of different backgrounds which may be associated with the foreground object image; merging the selected background image with the foreground object image to form a synthesized image; selecting one or more image transformations to apply on the synthesized image; applying the selected one or more image transformations to the synthesized image; and storing the synthesized image in a collection of synthesized images to train a machine learning model.
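The sequence of steps in this abstract (transform the foreground, pick a background, merge, transform the result, store it) can be sketched in Python with NumPy arrays standing in for images. The specific transformations (horizontal flip, brightness shift), the boolean foreground mask, and the function `synthesize` are assumptions chosen for illustration; the abstract does not name particular transformations.

```python
import numpy as np

def synthesize(foreground, mask, backgrounds, rng):
    """Build one synthesized image from a foreground object and a random background."""
    # Preprocessing transformation on the foreground (assumed: random horizontal flip).
    if rng.random() < 0.5:
        foreground, mask = foreground[:, ::-1], mask[:, ::-1]
    # Select a background image from the available set.
    background = backgrounds[rng.integers(len(backgrounds))].copy()
    # Merge: paste the foreground pixels at a random position.
    fh, fw = mask.shape
    bh, bw = background.shape[:2]
    top = int(rng.integers(bh - fh + 1))
    left = int(rng.integers(bw - fw + 1))
    region = background[top:top + fh, left:left + fw]  # view into the copy
    region[mask] = foreground[mask]
    # Transformation on the synthesized image (assumed: brightness shift).
    shift = int(rng.integers(-20, 21))
    return np.clip(background.astype(np.int16) + shift, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
foreground = np.full((4, 4, 3), 200, dtype=np.uint8)   # toy object image
mask = np.ones((4, 4), dtype=bool)                      # which pixels belong to the object
backgrounds = [
    np.zeros((8, 8, 3), dtype=np.uint8),
    np.full((8, 8, 3), 50, dtype=np.uint8),
]

# Store the synthesized images in a collection for later training.
collection = [synthesize(foreground, mask, backgrounds, rng) for _ in range(3)]
```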

5.

Invention application
BI-DIRECTIONAL CONTEXTUALIZED TEXT DESCRIPTION (Status: under examination, published)

Publication No.: US20200258498A1

Publication date: 2020-08-13

Application No.: US16270328

Filing date: 2019-02-07

Applicant: SAP SE

Inventor: Christian Reisswig, Darko Velkoski, Sohyeong Kim, Hung Tu Dinh, Faisal El Hussein

IPC: G10L15/06, G10L15/22, G10L15/183

Abstract: Various examples described herein are directed to systems and methods for analyzing text. A computing device may train an autoencoder language model using a plurality of language model training samples. The autoencoder language model may comprise a first convolutional layer. Also, a first language model training sample of the plurality of language model training samples may comprise a first set of ordered strings comprising a masked string, a first string preceding the masked string in the first set of ordered strings, and a second string after the masked string in the first set of ordered strings. The computing device may generate a first feature vector using an input sample and the autoencoder language model. The computing device may also generate a descriptor of the input sample using a target model, the input sample, and the first feature vector.
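The shape of a training sample described here (an ordered set of strings with one masked string that has context on both sides) can be sketched in Python as follows. The mask token `[MASK]`, the dictionary layout, and the function `make_masked_samples` are hypothetical; the convolutional autoencoder itself is not shown.

```python
def make_masked_samples(strings, mask_token="[MASK]"):
    """Create one sample per interior position, so that every masked string
    has at least one string before it and one string after it."""
    samples = []
    for i in range(1, len(strings) - 1):
        masked = strings[:i] + [mask_token] + strings[i + 1:]
        samples.append({"input": masked, "target": strings[i]})
    return samples

samples = make_masked_samples(["pay", "the", "invoice", "today"])
```

Each sample pairs the masked sequence with the original string at the masked position, which is what a model would be trained to reconstruct.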

6.

Granted invention patent
Contextualized text description (Status: in force)

Publication No.: US11003861B2

Publication date: 2021-05-11

Application No.: US16275025

Filing date: 2019-02-13

Applicant: SAP SE

Inventor: Christian Reisswig, Darko Velkoski, Sohyeong Kim, Hung Tu Dinh

IPC: G06F40/00, G06F40/30, G06F40/211, G06F40/284

Abstract: Various examples are directed to systems and methods for classifying text. A computing device may access, from a database, an input sample comprising a first set of ordered words. The computing device may generate a first language model feature vector for the input sample using a word level language model and a second language model feature vector for the input sample using a partial word level language model. The computing device may generate a descriptor of the input sample using a target model, the input sample, the first language model feature vector, and the second language model feature vector and write the descriptor of the input sample to the database.

7.

Granted invention patent
Robust key value extraction (Status: in force)

Publication No.: US10824808B2

Publication date: 2020-11-03

Application No.: US16196153

Filing date: 2018-11-20

Applicant: SAP SE

Inventor: Christian Reisswig, Eduardo Vellasques, Sohyeong Kim, Darko Velkoski, Hung Tu Dinh

IPC: G10L15/02, G06F40/295, G06F40/289, G06N3/04, G06N3/08, G06F40/30

Abstract: Disclosed herein are system, method, and computer program product embodiments for robust key value extraction. In an embodiment, one or more hierarchical concept units (HCUs) may be configured to extract key value and hierarchical information from text inputs. The HCUs may use a convolutional neural network, a recurrent neural network, and feature selectors to analyze the text inputs using machine learning techniques to extract the key value and hierarchical information. Multiple HCUs may be used together and configured to identify different categories of hierarchical information. While multiple HCUs may be used, each may use a skip connection to transmit extracted information to a feature concatenation layer. This allows an HCU to directly send a concept that has been identified as important to the feature concatenation layer and bypass other HCUs.

8.

Invention application
IMAGE SEARCH AND TRAINING SYSTEM (Status: under examination, published)

Publication No.: US20200302229A1

Publication date: 2020-09-24

Application No.: US16357720

Filing date: 2019-03-19

Applicant: SAP SE

Inventor: Sohyeong Kim, Eduardo Vellasques, Urko Sanchez Sanz

IPC: G06K9/62, G06F16/532, G06F16/56, G06N3/08

Abstract: Disclosed herein are system, method, and computer program product embodiments for providing an image search training system. An embodiment operates by determining a query image on which to train an image search system, and a positive image visually similar to the query image. A set of negative images from a plurality of negative images visually dissimilar to the query image is selected, where the selected set of negative images includes both a first negative image and a second negative image. A first similarity measure between the first negative image and the positive image, and a second similarity measure between the second negative image and the positive image are calculated. The first negative image is selected based on the first similarity measure being less than the second similarity measure. The query image, the positive image, and the first negative image are provided to the image search system for training.

9.

Granted invention patent
Query image synthesis for image retrieval systems based on data augmentation techniques (Status: in force)

Publication No.: US11531837B2

Publication date: 2022-12-20

Application No.: US16932130

Filing date: 2020-07-17

Applicant: SAP SE

Inventor: Sohyeong Kim, Ying Jiang, Cordula Guder

IPC: G06K9/62, G06K9/00, G06F16/535, G06N20/00, G06T7/90

Abstract: Methods, systems, and articles of manufacture, including computer program products, are provided for synthesizing images for machine learning. The method may include selecting one or more image preprocessing transformations to apply on the foreground object image; applying the selected one or more image preprocessing transformations to the foreground object image; selecting a background image from a set of background images depicting a variety of different backgrounds which may be associated with the foreground object image; merging the selected background image with the foreground object image to form a synthesized image; selecting one or more image transformations to apply on the synthesized image; applying the selected one or more image transformations to the synthesized image; and storing the synthesized image in a collection of synthesized images to train a machine learning model.

10.

Granted invention patent
Bi-directional contextualized text description (Status: in force)

Publication No.: US10963645B2

Publication date: 2021-03-30

Application No.: US16270328

Filing date: 2019-02-07

Applicant: SAP SE

Inventor: Christian Reisswig, Darko Velkoski, Sohyeong Kim, Hung Tu Dinh, Faisal El Hussein

IPC: G06F40/30, G10L15/183

Abstract: Various examples described herein are directed to systems and methods for analyzing text. A computing device may train an autoencoder language model using a plurality of language model training samples. The autoencoder language model may comprise a first convolutional layer. Also, a first language model training sample of the plurality of language model training samples may comprise a first set of ordered strings comprising a masked string, a first string preceding the masked string in the first set of ordered strings, and a second string after the masked string in the first set of ordered strings. The computing device may generate a first feature vector using an input sample and the autoencoder language model. The computing device may also generate a descriptor of the input sample using a target model, the input sample, and the first feature vector.
