Patent search ap:("International Business Machines Corporation") AND inv:"Namit Kabra" Page 1

1.

Patent application
Dynamic Condensing of Digital Content with Insertion of Expansion Elements (Status: In force)

Publication No.: US20250061472A1

Publication date: 2025-02-20

Application No.: US18234577

Filing date: 2023-08-16

Applicant: International Business Machines Corporation

Inventor: Namit Kabra, Sarbajit K. Rakshit, Vijay Ekambaram

IPC: G06Q30/0201, G06N20/00

Abstract: Mechanisms are provided for rendering content in a compacted view. A machine learning computer model is trained by a machine learning process to predict a user attention score for segments of content based on features of the content and historical user attention data. The trained machine learning computer model processes new content to associate with each segment, in a plurality of segments, of the new content, a corresponding user attention score. The segments, in the plurality of segments, of the new content are ranked relative to one another based on the corresponding user attention scores of the segments. A compacted view of the new content is rendered based on the ranking of the segments. A first number of segments are rendered in the compacted view and a second number of segments are not rendered in the compacted view, and are replaced with an inserted user selectable expansion element.
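The ranking and compaction steps in this abstract can be sketched as follows. This is a minimal illustration, not the patented mechanism: the attention scores are taken as given inputs rather than produced by a trained model, and the function name `compact_view`, the `keep` parameter, and the `EXPANSION_MARKER` placeholder are all hypothetical.

```python
from typing import List

# Hypothetical stand-in for the user-selectable expansion element.
EXPANSION_MARKER = "[+ expand]"


def compact_view(segments: List[str], scores: List[float], keep: int) -> List[str]:
    """Keep the `keep` highest-scoring segments in their original order and
    replace each run of hidden segments with a single expansion marker."""
    if len(segments) != len(scores):
        raise ValueError("segments and scores must have the same length")

    # Rank segment indices by attention score, highest first.
    ranked = sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)
    visible = set(ranked[:keep])

    view: List[str] = []
    for i, segment in enumerate(segments):
        if i in visible:
            view.append(segment)
        elif not view or view[-1] != EXPANSION_MARKER:
            # Collapse consecutive hidden segments into one marker.
            view.append(EXPANSION_MARKER)
    return view
```

Collapsing adjacent hidden segments into one marker is a design choice of this sketch; the abstract only states that non-rendered segments are replaced with an inserted expansion element.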

2.

Granted patent
Data classification (Status: In force)

Publication No.: US11748382B2

Publication date: 2023-09-05

Application No.: US16876660

Filing date: 2020-05-18

Applicant: International Business Machines Corporation

Inventor: Yannick Saillet, Namit Kabra, Mike W. Grasselt, Krishna Kishore Bonagiri

IPC: G06F16/28, G06F16/2457, G06F16/22, G06N20/00, G06F16/248, G06F18/214, G06N7/01

CPC classification number: G06F16/285, G06F16/221, G06F16/248, G06F16/24573, G06F18/214, G06N7/01, G06N20/00

Abstract: A method provides for classifying data fields of a dataset. A classifier configured for determining confidence values for a plurality of data classes for the data fields may be applied. Using the confidence values, data class candidates may be identified. Data fields may be determined for which a plurality of data class candidates is identifiable. Using previous user-selected data class assignments, a probability may be determined for the data class candidates that the respective data class candidate is a data class to which the respective data field is to be assigned. The data fields may be classified using the probabilities to select for the data fields a data class from the data class candidates. The dataset may be provided with metadata identifying for the data fields the data classes to which the respective data fields are assigned.
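One way to picture the combination of classifier confidence and previous user selections is the sketch below. It is an assumption-laden illustration: the abstract does not say how the probability is computed, so the product of confidence and smoothed selection counts used here, along with the names `candidate_probabilities`, `classify_field`, and `min_confidence`, is hypothetical.

```python
from collections import Counter
from typing import Dict, Optional


def candidate_probabilities(
    confidences: Dict[str, float],
    past_choices: Counter,
    min_confidence: float = 0.5,
) -> Dict[str, float]:
    """Turn classifier confidences into probabilities over the candidates.

    Assumption: a class is a candidate when its confidence reaches
    `min_confidence`, and its probability is proportional to
    confidence * (number of past user selections + 1).
    """
    candidates = {c: v for c, v in confidences.items() if v >= min_confidence}
    weighted = {c: v * (past_choices[c] + 1) for c, v in candidates.items()}
    total = sum(weighted.values())
    return {c: w / total for c, w in weighted.items()} if total else {}


def classify_field(
    confidences: Dict[str, float],
    past_choices: Counter,
    min_confidence: float = 0.5,
) -> Optional[str]:
    """Select the most probable candidate class, or None if there is none."""
    probs = candidate_probabilities(confidences, past_choices, min_confidence)
    return max(probs, key=probs.get) if probs else None
```

For example, with confidences of 0.80 for `phone` and 0.78 for `fax`, a history in which users chose `fax` far more often tips the selection toward `fax` even though the raw classifier confidence is slightly lower.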

3.

Granted patent
Generating weights for finding duplicate records (Status: In force)

Publication No.: US11687491B2

Publication date: 2023-06-27

Application No.: US16037444

Filing date: 2018-07-17

Applicant: International Business Machines Corporation

Inventor: Namit Kabra, Manish A. Bhide

IPC: G06F16/17, G06F16/174, G06F17/10, G06F16/903, G06F18/2411

CPC classification number: G06F16/1748, G06F16/90335, G06F17/10, G06F18/2411

Abstract: Data-deduplicating includes comparing a first record of a data-store with a second record of the data-store but instead of using a static weight for a field, the present data-deduplicating dynamically assigns a first weight for the first score to generate a first weighted score, wherein the first weight is based on one or both of the first value or the second value; and assigns a second weight for the second score to generate a second weighted score. A composite score is calculated based on the first weighted score and the second weighted score; and it is determined whether or not the first record and the second record are duplicate records, based on the composite score.
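The idea of value-dependent weights feeding a composite score can be sketched as below. This is an illustration under stated assumptions, not the claimed method: the abstract does not specify how a weight is derived from the values, so the inverse-frequency rule in `dynamic_weight`, the use of `difflib.SequenceMatcher` for per-field scores, and the `threshold` default are all hypothetical choices.

```python
from collections import Counter
from difflib import SequenceMatcher
from typing import Dict, List

Record = Dict[str, str]


def value_frequencies(records: List[Record]) -> Dict[str, Counter]:
    """Count how often each value occurs per field across the data store."""
    freqs: Dict[str, Counter] = {}
    for record in records:
        for field, value in record.items():
            freqs.setdefault(field, Counter())[value] += 1
    return freqs


def field_score(a: str, b: str) -> float:
    """Similarity of two field values in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def dynamic_weight(a: str, b: str, freq: Counter) -> float:
    """Assumption: rarer values are more discriminating, so weight them more."""
    return 1.0 / max(freq[a], freq[b], 1)


def composite_score(r1: Record, r2: Record, freqs: Dict[str, Counter]) -> float:
    """Weighted average of per-field scores over the fields both records share."""
    total = 0.0
    weight_sum = 0.0
    for field in r1.keys() & r2.keys():
        weight = dynamic_weight(r1[field], r2[field], freqs.get(field, Counter()))
        total += weight * field_score(r1[field], r2[field])
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


def is_duplicate(r1: Record, r2: Record, freqs: Dict[str, Counter],
                 threshold: float = 0.85) -> bool:
    """Decide duplicate status from the composite score."""
    return composite_score(r1, r2, freqs) >= threshold
```

With this weighting, agreement on a value shared by many records (for example a common city name) contributes less to the composite score than it would under a static per-field weight.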

4.

Patent application
INSIGHT EXPANSION IN SMART DATA RETENTION SYSTEMS (Status: In force)

Publication No.: US20220222265A1

Publication date: 2022-07-14

Application No.: US17145458

Filing date: 2021-01-11

Applicant: International Business Machines Corporation

Inventor: Namit Kabra, Ritesh Kumar Gupta, Ron Reuben, Vijay Ekambaram, Smitkumar Narotambhai Marvaniya

IPC: G06F16/25, G06F16/23, G06F16/951, G06F16/215, G06N20/00

Abstract: A computer-implemented method applies insights from a variety of data sources to each of the data sources. The method includes identifying a set of data sources, wherein each of the data sources is associated with a domain. The method includes analyzing documentation for each of the data sources. The method further includes extracting a set of attributes for each data source, and determining a data schema associated with each data source. The method includes mapping each data schema to a common domain schema. The method also includes linking, based on the mapping and on the set of attributes for each data source, common features across each data source. The method includes generating, in response to the linking, a knowledge graph. The method further includes preparing a visual display for a set of domain insights; and forking the set of domain insights into a first data source.

5.

Patent application
MOBILE DEVICE BASED VR CONTROL (Status: In force)

Publication No.: US20220028168A1

Publication date: 2022-01-27

Application No.: US16934280

Filing date: 2020-07-21

Applicant: International Business Machines Corporation

Inventor: Namit Kabra, Smitkumar Narotambhai Marvaniya, Yannick Saillet, Kunjavihari Madhav Kashalikar

IPC: G06T19/00, G06K9/00, G02B27/01, G06F3/0481, G06F3/0484

Abstract: Aspects of the present disclosure relate to controlling virtual reality (VR) content displayed on a VR head mounted display (HMD). Communication can be established between a computer system, a VR HMD, and a mobile device. A user input configured to control VR content displayed on a display of the VR HMD can be received on the mobile device. The VR content displayed on the VR HMD can then be controlled based on the user input received on the mobile device.

6.

Granted patent
Column weight calculation for data deduplication (Status: In force)

Publication No.: US10452627B2

Publication date: 2019-10-22

Application No.: US15171200

Filing date: 2016-06-02

Applicant: International Business Machines Corporation

Inventor: Namit Kabra, Yannick Saillet

IPC: G06F7/00, G06F16/215, G06F16/21, G06F16/174

Abstract: A computer system with the capability to identify potentially duplicative records in a data set is provided. A computer may collect a data profile for the data set that provides descriptive information with regard to attributes of the data set. Based, at least in part, on the data profile, weights are determined for the attributes. As values of a data record are compared to values of the same respective attributes in other records, the overall likelihood of a match or duplicate, as indicated by the degree of similarity between values, is modified based on the determined weights associated with the respective attributes.

7.

Patent application
DATA STANDARDIZATION RULES GENERATION (Status: Under examination, published)

Publication No.: US20190179888A1

Publication date: 2019-06-13

Application No.: US15838463

Filing date: 2017-12-12

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Yannick Saillet, Martin Oberhofer, Namit Kabra

IPC: G06F17/27, G06K9/62, G06N5/02, G06F17/30, G06N5/04

Abstract: A method for generating data standardization rules includes receiving a training data set containing tokenized and tagged data values. A set of machine mining models is built using different learning algorithms for identifying tags and tag patterns using the training set. For each data value in a further data set: a tokenization is applied on the data value, resulting in a set of tokens. For each token of the set of tokens one or more tag candidates are determined using a lookup dictionary of tags and tokens and/or at least part of the set of machine mining models, resulting for each token of the set of tokens in a list of possible tags. Unique combinations of the sets of tags of the further data set having highest aggregated confidence values are provided for use as standardization rules.

8.

Patent application
EFFICIENTLY FINDING POTENTIAL DUPLICATE VALUES IN DATA (Status: Under examination, published)

Publication No.: US20180137189A1

Publication date: 2018-05-17

Application No.: US15349421

Filing date: 2016-11-11

Applicant: International Business Machines Corporation

Inventor: Namit Kabra, Yannick Saillet

IPC: G06F17/30

CPC classification number: G06F16/285, G06F16/215, G06F16/24553, G06F16/24578, G06F16/258

Abstract: A method, system and computer program product for finding groups of potential duplicates in attribute values. Each attribute value of the attribute values is converted to a respective set of bigrams. All bigrams present in the attribute values may be determined. Bigrams present in the attribute values may be represented as bits. This may result in a bitmap representing the presence of the bigrams in the attribute values. The attribute values may be grouped using bitwise operations on the bitmap, where each group includes attribute values that are determined based on pairwise bigram-based similarity scores. The pairwise bigram-based similarity score reflects the number of common bigrams between two attribute values.
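The bigram-to-bitmap encoding described in this abstract can be sketched with Python integers acting as bit sets. The encoding and the bitwise count of common bigrams follow the abstract; the greedy grouping strategy (compare each value with the first member of each existing group) and the names `build_bitmap`, `common_bigrams`, `group_values`, and `min_common` are assumptions made for illustration.

```python
from typing import Dict, List, Set


def bigrams(value: str) -> Set[str]:
    """Set of lower-cased two-character substrings of a value."""
    text = value.lower()
    return {text[i:i + 2] for i in range(len(text) - 1)}


def build_bitmap(values: List[str]) -> List[int]:
    """One integer per value; bit k is set if bigram number k occurs in it."""
    bit_index: Dict[str, int] = {}
    rows: List[int] = []
    for value in values:
        bits = 0
        for bigram in bigrams(value):
            # Assign the next free bit position to a bigram seen for the first time.
            position = bit_index.setdefault(bigram, len(bit_index))
            bits |= 1 << position
        rows.append(bits)
    return rows


def common_bigrams(a: int, b: int) -> int:
    """Number of bigrams two values share: popcount of the bitwise AND."""
    return bin(a & b).count("1")


def group_values(values: List[str], min_common: int = 3) -> List[List[str]]:
    """Greedy grouping: a value joins the first group whose first member
    shares at least `min_common` bigrams with it, else starts a new group."""
    rows = build_bitmap(values)
    groups: List[List[int]] = []
    for i, row in enumerate(rows):
        for group in groups:
            if common_bigrams(row, rows[group[0]]) >= min_common:
                group.append(i)
                break
        else:
            groups.append([i])
    return [[values[i] for i in group] for group in groups]
```

Because each comparison is a single AND followed by a bit count, pairwise similarity checks stay cheap even when the number of distinct bigrams is large.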

9.

Patent application
COMPUTING THE NEED FOR STANDARDIZATION OF A SET OF VALUES (Status: Under examination, published)

Publication No.: US20180137151A1

Publication date: 2018-05-17

Application No.: US15831575

Filing date: 2017-12-05

Applicant: International Business Machines Corporation

Inventor: Namit Kabra, Yannick Saillet

IPC: G06F17/30

Abstract: A method, system and computer program product for determining a data standardization score for an attribute of a dataset. A data standardization score is calculated, which reflects whether data quality of attribute values would increase if a standardization rule is applied to the attribute values. Based on attribute metadata, it may be determined whether an indication to carry out or not to carry out standardization is available for at least part of the attribute values of the dataset. In response to finding the indication, a respective value may be set for the data standardization score. In response to not finding the indication, a data standardization score algorithm may be run on the at least part of the attribute values of the dataset. The data standardization score value may be compared to a predefined criterion to determine whether data standardization is to be applied on the attribute.

10.

Patent application
AUTOMATED DATA DUPLICATE IDENTIFICATION (Status: Under examination, published)

Publication No.: US20160162507A1

Publication date: 2016-06-09

Application No.: US14561927

Filing date: 2014-12-05

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Ritesh K. Gupta, Namit Kabra, Manish Kumar, Srinivas K. Mittapalli

IPC: G06F17/30

CPC classification number: G06F16/215

Abstract: In an approach to identifying duplicates in data, one or more computer processors receive a request from a user to identify duplicates in a data set. The one or more computer processors retrieve the data set utilizing data discovery. The one or more computer processors perform data profiling on the data set. The one or more computer processors determine one or more domain types of the data set, based, at least in part, on the performed data profiling. The one or more computer processors perform data standardization on the data set, based, at least in part, on the one or more determined domain types. Responsive to performing data standardization, the one or more computer processors perform probabilistic matching on the data set. The one or more computer processors identify two or more duplicates in the data set, based, at least in part, on the probabilistic matching.
