Patent search: ap:("International Business Machines Corporation") AND inv:"Esther Goldbraich" (page 1)

1.

Patent application
LANGUAGE-MODEL-BASED DATA AUGMENTATION METHOD FOR TEXTUAL CLASSIFICATION TASKS WITH LITTLE DATA (status: active)

Publication number: US20210350076A1

Publication date: 2021-11-11

Application number: US16870917

Filing date: 2020-05-09

Applicant: International Business Machines Corporation

Inventors: Amir Kantor, Ateret Anaby Tavor, Boaz Carmeli, Esther Goldbraich, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling

IPC classification: G06F40/279, G06N20/00, G06N5/04

Abstract: Embodiments of the present systems and methods may provide techniques for augmenting textual data that may be used for textual classification tasks. Embodiments of such techniques may provide the capability to synthesize labeled data to improve text classification tasks. Embodiments may be specifically useful when only a small amount of data is available, and provide improved performance in such cases. For example, in an embodiment, a method implemented in a computer system may comprise a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, and the method may comprise fine-tuning a language model using a training dataset, synthesizing a plurality of samples using the fine-tuned language model, filtering the plurality of synthesized samples, and generating an augmented training dataset comprising the training dataset and the filtered plurality of synthesized sentences.
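The synthesize, filter, and merge steps named in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the fine-tuned language model and the filtering classifier are replaced by hypothetical stand-in callables (`toy_synthesize`, `toy_confidence`), and the function name, the `threshold` parameter, and the sample data are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Sample = Tuple[str, str]  # (text, label)


def build_augmented_dataset(
    train: List[Sample],
    synthesize: Callable[[str, int], List[str]],
    confidence: Callable[[str, str], float],
    per_label: int = 5,
    threshold: float = 0.5,
) -> List[Sample]:
    """Return the training set plus the synthesized samples that pass the filter."""
    labels = sorted({label for _, label in train})
    seen = {text for text, _ in train}
    kept: List[Sample] = []
    for label in labels:
        for text in synthesize(label, per_label):
            if text in seen:
                continue  # skip copies of existing samples
            if confidence(text, label) >= threshold:
                kept.append((text, label))
                seen.add(text)
    return train + kept


# Hypothetical toy data and stand-ins (assumptions, not from the patent).
train = [
    ("book a flight to paris", "travel"),
    ("reserve a hotel room", "travel"),
    ("play some jazz", "music"),
    ("turn up the volume", "music"),
]


def toy_synthesize(label: str, n: int) -> List[str]:
    # Stand-in for sampling from a fine-tuned language model.
    canned = {
        "travel": ["book a hotel in rome", "play a song", "book a flight to paris"],
        "music": ["play some rock", "reserve a table"],
    }
    return canned[label][:n]


def toy_confidence(text: str, label: str) -> float:
    # Stand-in for a classifier: share of words already seen under this label.
    vocab = {w for t, l in train if l == label for w in t.split()}
    words = text.split()
    return sum(w in vocab for w in words) / len(words)


augmented = build_augmented_dataset(train, toy_synthesize, toy_confidence)
for text, label in augmented[len(train):]:
    print(label, "|", text)
```

In practice the two callables would wrap a real generator and a real classifier; passing them in as parameters keeps the filtering logic independent of either model.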

2.

Patent application
GENERATING PREDICTED REACTIONS OF A USER (status: under examination, published)

Publication number: US20190146636A1

Publication date: 2019-05-16

Application number: US15811714

Filing date: 2017-11-14

Applicant: International Business Machines Corporation

Inventors: Shiri Kremer-Davidson, Anat Hashavit, Esther Goldbraich, Maya Barnea, Oren Sar-Shalom

IPC classification: G06F3/0482, G06N7/00, G06F3/0484

Abstract: The present invention provides a method, computer program product, and system of generating prioritized list. In an embodiment, the method, computer program product, and system include receiving, by a computer system, target user identification data identifying a target user, target action data, social network content for the one or more users, and social network activity data for the one or more users, analyzing, by a computer system, social network links between the source user and the target user and the social network activity data for the one or more users, determining, by a computer system, a prioritized list of probabilistic action paths that could result in the target user performing the target action on the content based on the analyzing, and outputting the prioritized list to the source user.
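One way to picture a "prioritized list of probabilistic action paths" is the Python sketch below. It is an illustration under stated assumptions, not the method claimed in the patent: the social network is modeled as a hypothetical weighted graph in which each edge carries an assumed probability that content passes from one user to the next, a path's probability is taken as the product of its edge probabilities, and `max_hops` is an invented cutoff.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]  # user -> {neighbor: pass-along probability}


def prioritized_paths(
    graph: Graph, source: str, target: str, max_hops: int = 3
) -> List[Tuple[float, List[str]]]:
    """List simple paths from source to target, most probable first."""
    results: List[Tuple[float, List[str]]] = []

    def walk(node: str, path: List[str], prob: float) -> None:
        if node == target:
            results.append((prob, path))
            return
        if len(path) - 1 >= max_hops:
            return  # hop limit reached without meeting the target
        for nxt, p in graph.get(node, {}).items():
            if nxt not in path:  # keep paths simple (no repeated users)
                walk(nxt, path + [nxt], prob * p)

    walk(source, [source], 1.0)
    return sorted(results, key=lambda r: r[0], reverse=True)


# Hypothetical network; the probabilities are made up for illustration.
graph: Graph = {
    "src": {"amy": 0.5, "tgt": 0.25},
    "amy": {"tgt": 0.75, "ben": 0.5},
    "ben": {"tgt": 0.5},
}

ranked = prioritized_paths(graph, "src", "tgt")
for prob, path in ranked:
    print(f"{prob:.3f}", " -> ".join(path))
```

With these made-up numbers, the two-hop route through `amy` outranks the direct link, which is the kind of ordering a source user would be shown.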

3.

Granted patent
Techniques for generating a topic model (status: active)

Publication number: US11914966B2

Publication date: 2024-02-27

Application number: US16445256

Filing date: 2019-06-19

Applicant: International Business Machines Corporation

Inventors: Esther Goldbraich

IPC classification: G06F40/40, G06N20/00, G06F16/28, H04L51/52

CPC classification: G06F40/40, G06F16/285, G06N20/00, H04L51/52

Abstract: In some examples, a system for generating a topic model includes a processor that can process a set of documents to generate training data, wherein each document in the set of documents is associated with one or more users. The processor can also generate a plurality of topic models using the training data, such that each topic model includes a different number of topics. The processor can also generate an evaluation score for each of the topic models based on information about the users associated with the documents included in the training data. The evaluation score describes a percentage of topics that exhibit a specified level of interest from a specified number of users. The processor can also identify a final topic model based on the evaluation scores and store the final topic model to be used in natural language processing.
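The evaluation score described in this abstract (the percentage of topics that draw a specified level of interest from a specified number of users) can be sketched in Python as follows. This is a rough illustration, not the patented system: defining a user's interest in a topic as the mean topic weight over that user's documents is an assumption, as are the names `min_interest` and `min_users` and the toy topic weights, which stand in for the output of real topic models.

```python
from collections import defaultdict
from typing import Dict, List, Set


def evaluation_score(
    doc_topic_weights: List[List[float]],
    doc_users: List[Set[str]],
    min_interest: float = 0.5,
    min_users: int = 2,
) -> float:
    """Percentage of topics with at least min_users sufficiently interested users."""
    num_topics = len(doc_topic_weights[0])
    totals: Dict[str, List[float]] = defaultdict(lambda: [0.0] * num_topics)
    counts: Dict[str, int] = defaultdict(int)
    for weights, users in zip(doc_topic_weights, doc_users):
        for user in users:
            counts[user] += 1
            for t, w in enumerate(weights):
                totals[user][t] += w
    supported = 0
    for t in range(num_topics):
        # Assumed interest measure: mean topic weight over the user's documents.
        interested = sum(
            1 for u in totals if totals[u][t] / counts[u] >= min_interest
        )
        if interested >= min_users:
            supported += 1
    return 100.0 * supported / num_topics


# Hypothetical data: four documents, their users, and two candidate models.
doc_users = [{"ann"}, {"bob"}, {"ann", "cat"}, {"bob", "cat"}]
candidates = {
    2: [[0.9, 0.1], [0.2, 0.8], [0.8, 0.2], [0.1, 0.9]],
    3: [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.8, 0.1, 0.1], [0.1, 0.8, 0.1]],
}

scores = {k: evaluation_score(w, doc_users) for k, w in candidates.items()}
best_k = max(scores, key=scores.get)  # candidate with the highest score
print(scores, "-> final model has", best_k, "topics")
```

Picking the maximum score is only one possible selection rule; the abstract says the final model is identified "based on the evaluation scores" without fixing the rule.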

4.

Granted patent
Method and system for generating a prioritized list (status: active)

Publication number: US11188193B2

Publication date: 2021-11-30

Application number: US15811714

Filing date: 2017-11-14

Applicant: International Business Machines Corporation

Inventors: Shiri Kremer-Davidson, Anat Hashavit, Esther Goldbraich, Maya Barnea, Oren Sar-Shalom

IPC classification: G06F3/0482, G06N7/00, G06F3/0484, G06Q10/06, G06Q50/00, H04L29/08, H04L12/26, H04L12/24, G06Q40/00

Abstract: The present invention provides a method, computer program product, and system of generating prioritized list. In an embodiment, the method, computer program product, and system include receiving, by a computer system, target user identification data identifying a target user, target action data, social network content for the one or more users, and social network activity data for the one or more users, analyzing, by a computer system, social network links between the source user and the target user and the social network activity data for the one or more users, determining, by a computer system, a prioritized list of probabilistic action paths that could result in the target user performing the target action on the content based on the analyzing, and outputting the prioritized list to the source user.

5.

Granted patent
Dataset balancing via quality-controlled sample generation (status: active)

Publication number: US11797516B2

Publication date: 2023-10-24

Application number: US17317922

Filing date: 2021-05-12

Applicant: International Business Machines Corporation

Inventors: Naama Tepper, Esther Goldbraich, Boaz Carmeli, Naama Zwerdling, George Kour, Ateret Anaby Tavor

IPC classification: G06F16/23, G06N20/00

CPC classification: G06F16/2365, G06N20/00

Abstract: Balancing an imbalanced dataset, by: Receiving a balancing policy and the imbalanced dataset. Performing initial adjustment of the imbalanced dataset to comply with the balancing policy, by: oversampling one or more underrepresented classes, and, if one or more of the classes are overrepresented, undersampling them. Operating a generative machine learning model to generate samples for the one or more underrepresented classes, based on the initially-adjusted dataset. Operating a machine learning classification model to label the generated samples with class labels corresponding to the one or more underrepresented classes. Selecting some of the generated samples which, according to the labeling, have a relatively high probability of preserving their class labels. Composing a balanced dataset which complies with the balancing policy and comprises: the samples belonging to the one or more underrepresented classes, the selected generated samples, and an undersampling of the samples belonging to the one or more overrepresented classes.
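The balancing steps listed in this abstract can be traced in a small Python sketch. It is a simplified illustration, not the patented method: the balancing policy is assumed to be a single target count per class, oversampling is done by deterministic repetition rather than random sampling, and the generative model and classifier are hypothetical stand-in callables (`toy_generate`, `toy_probability`).

```python
from itertools import cycle, islice
from typing import Callable, Dict, List

Dataset = Dict[str, List[str]]  # class label -> samples


def balance(
    dataset: Dataset,
    target: int,
    generate: Callable[[str, Dataset], List[str]],
    label_probability: Callable[[str, str], float],
) -> Dataset:
    """Compose a dataset with `target` samples per class (assumed policy)."""
    # Initial adjustment: repeat small classes, truncate large ones.
    adjusted: Dataset = {}
    for cls, samples in dataset.items():
        if len(samples) < target:
            adjusted[cls] = list(islice(cycle(samples), target))
        else:
            adjusted[cls] = samples[:target]

    balanced: Dataset = {}
    for cls, samples in dataset.items():
        if len(samples) >= target:
            balanced[cls] = samples[:target]  # undersample
            continue
        missing = target - len(samples)
        candidates = generate(cls, adjusted)
        # Keep the candidates most likely to preserve their class label.
        ranked = sorted(
            candidates, key=lambda s: label_probability(s, cls), reverse=True
        )
        # If too few candidates exist, the class stays short of the target.
        balanced[cls] = samples + ranked[:missing]
    return balanced


# Hypothetical toy data and stand-ins (assumptions, not from the patent).
dataset: Dataset = {
    "spam": ["win cash now", "free prize"],
    "ham": ["see you at noon", "lunch tomorrow?", "call me later", "meeting moved"],
}


def toy_generate(cls: str, adjusted: Dataset) -> List[str]:
    # Stand-in for a generative model trained on `adjusted`; returns canned text.
    canned = {"spam": ["free cash prize", "see you soon", "win big now"]}
    return canned.get(cls, [])


def toy_probability(text: str, cls: str) -> float:
    # Stand-in for a classifier: share of words already seen in this class.
    vocab = {w for s in dataset[cls] for w in s.split()}
    words = text.split()
    return sum(w in vocab for w in words) / len(words)


balanced = balance(dataset, 3, toy_generate, toy_probability)
for cls, samples in balanced.items():
    print(cls, samples)
```

The original samples of the underrepresented class are kept as they are, and only the shortfall is filled with generated text, which mirrors the composition described in the abstract's last sentence.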

6.

Granted patent
Language-model-based data augmentation method for textual classification tasks with little data (status: active)

Publication number: US11526667B2

Publication date: 2022-12-13

Application number: US16870917

Filing date: 2020-05-09

Applicant: International Business Machines Corporation

Inventors: Amir Kantor, Ateret Anaby Tavor, Boaz Carmeli, Esther Goldbraich, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling

IPC classification: G06F40/279, G06N5/04, G06N20/00

Abstract: Embodiments of the present systems and methods may provide techniques for augmenting textual data that may be used for textual classification tasks. Embodiments of such techniques may provide the capability to synthesize labeled data to improve text classification tasks. Embodiments may be specifically useful when only a small amount of data is available, and provide improved performance in such cases. For example, in an embodiment, a method implemented in a computer system may comprise a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, and the method may comprise fine-tuning a language model using a training dataset, synthesizing a plurality of samples using the fine-tuned language model, filtering the plurality of synthesized samples, and generating an augmented training dataset comprising the training dataset and the filtered plurality of synthesized sentences.

7.

Patent application
DATASET BALANCING VIA QUALITY-CONTROLLED SAMPLE GENERATION (status: active)

Publication number: US20220374410A1

Publication date: 2022-11-24

Application number: US17317922

Filing date: 2021-05-12

Applicant: International Business Machines Corporation

Inventors: Naama Tepper, Esther Goldbraich, Boaz Carmeli, Naama Zwerdling, George Kour, Ateret Anaby Tavor

IPC classification: G06F16/23, G06N20/00

Abstract: Balancing an imbalanced dataset, by: Receiving a balancing policy and the imbalanced dataset. Performing initial adjustment of the imbalanced dataset to comply with the balancing policy, by: oversampling one or more underrepresented classes, and, if one or more of the classes are overrepresented, undersampling them. Operating a generative machine learning model to generate samples for the one or more underrepresented classes, based on the initially-adjusted dataset. Operating a machine learning classification model to label the generated samples with class labels corresponding to the one or more underrepresented classes. Selecting some of the generated samples which, according to the labeling, have a relatively high probability of preserving their class labels. Composing a balanced dataset which complies with the balancing policy and comprises: the samples belonging to the one or more underrepresented classes, the selected generated samples, and an undersampling of the samples belonging to the one or more overrepresented classes.
