Patent search ap:("salesforce.com inc.") AND inv:"Arun Kumar Jagota"

1.

Granted invention patent
Adaptive match indexes (In force)

Publication (announcement) number: US11372928B2

Publication (announcement) date: 2022-06-28

Application number: US16775611

Application date: 2020-01-29

Applicant: salesforce.com, inc.

Inventor: Arun Kumar Jagota, Ajitesh Jain, Rahul Mathias Madan, Shravani Madhavaram

IPC: G06F16/90, G06N20/00, G06F16/903, G06F16/901

Abstract: Determine first count of first records storing first value in first field, second count of second records storing second value in second field, third count of third records storing third value in third field. Determine count threshold using first, second and third counts, dispersion measure based on dispersion of values stored in second field by first records and other dispersion measure based on other dispersion of values stored in third field by first records. Train machine-learning model to determine dispersion measure threshold based on dispersion and other dispersion measures. If first count is greater than count threshold, and dispersion measure is greater than dispersion measure threshold, create match index based on first and second fields. Receive prospective record storing first value in first field, second value in second field. Use match index to identify record storing first value in first field, second value in second field as matching prospective record.

2.

Invention application
ADAPTIVE FIELD-LEVEL MATCHING (In force)

Publication (announcement) number: US20210342353A1

Publication (announcement) date: 2021-11-04

Application number: US16862667

Application date: 2020-04-30

Applicant: salesforce.com, inc.

Inventor: Arun Kumar Jagota, Ajitesh Jain, Rahul Mathias Madan, Shravani Madhavaram

IPC: G06F16/2455, G06N20/00

Abstract: Adaptive field-level matching is described. A system identifies first elements in a field of a prospective record for a database, and second elements in the field of a candidate record, in the database, for matching the prospective record. The system identifies features corresponding to any of the first elements that are identical to any of the second elements, any of the first elements that are absent from the second elements, and any of the second elements that are absent from the first elements. A machine-learning model uses the features to determine a field match score for the candidate record's field. Another machine-learning model weighs the field match score and weighs another field match score for another field of the candidate record to determine a record match score for the candidate record. If the record match score satisfies a threshold, the system identifies the candidate record as matching the prospective record.
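The three feature groups named in this abstract (elements shared by both records, elements found only in the prospective record, elements found only in the candidate record) can be pictured with a minimal Python sketch. The whitespace tokenizer, the function name `field_features`, and the dictionary keys are assumptions made for illustration; the abstract does not specify how elements are extracted or encoded, and the machine-learning models that score the features are not shown.

```python
def field_features(prospective_value: str, candidate_value: str) -> dict:
    """Split two field values into elements and group them by overlap.

    Hypothetical helper: elements are lowercased whitespace-separated words.
    """
    first = set(prospective_value.lower().split())
    second = set(candidate_value.lower().split())
    return {
        "shared": sorted(first & second),            # identical elements
        "only_prospective": sorted(first - second),  # absent from candidate
        "only_candidate": sorted(second - first),    # absent from prospective
    }


features = field_features("Acme Widgets Inc", "Acme Widgets Corporation")
```

In a fuller pipeline these groups would be turned into numeric features and passed to a trained model to obtain the field match score; that step is left out because the abstract gives no details about it.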

3.

Granted invention patent
Machine-learnt field-specific tokenization (In force)

Publication (announcement) number: US11163740B2

Publication (announcement) date: 2021-11-02

Application number: US16525945

Application date: 2019-07-30

Applicant: salesforce.com, inc.

Inventor: Arun Kumar Jagota

IPC: G06F16/30, G06F16/22, G06F16/28, G06N20/00

Abstract: A training set is created via creating adjacent classified substrings by using character classes to replace corresponding characters in adjacent substrings in each training character string, and associating each pair of adjacent classified substrings and each pair of adjacent substrings with corresponding labels indicating whether corresponding pairs include any token boundary. The system splits input character string into beginning and ending parts and creates classified beginning part by replacing beginning part character with corresponding class and classified ending part by replacing ending part character with corresponding class. The machine-learning model determines probability of token identification, based on training set to determine count of instances that classified beginning part is paired with classified ending part and count of corresponding labels that indicate inclusion of any token boundary. If token identification probability satisfies threshold, the system identifies beginning part as token and ending part as remainder of input character string.
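The counting idea in this abstract (replace characters with classes, count how often a classified pair occurs, and count how often that pair is labeled as a token boundary) can be sketched in Python as follows. This is a simplified illustration, not the patented method: it assumes adjacent substrings of length one, a four-way character class scheme, and hypothetical names (`char_class`, `train`, `boundary_probability`, `split_first_token`).

```python
from collections import Counter


def char_class(ch: str) -> str:
    """Map a character to an assumed class: digit, alpha, space, or other."""
    if ch.isdigit():
        return "d"
    if ch.isalpha():
        return "a"
    if ch.isspace():
        return "s"
    return "p"


def train(examples):
    """examples: iterable of (text, boundaries), where index i in boundaries
    means a token boundary lies between text[i - 1] and text[i]."""
    pair_counts, boundary_counts = Counter(), Counter()
    for text, boundaries in examples:
        for i in range(1, len(text)):
            key = (char_class(text[i - 1]), char_class(text[i]))
            pair_counts[key] += 1
            if i in boundaries:
                boundary_counts[key] += 1
    return pair_counts, boundary_counts


def boundary_probability(model, left: str, right: str) -> float:
    """Fraction of training instances of this class pair labeled as a boundary."""
    pair_counts, boundary_counts = model
    key = (char_class(left), char_class(right))
    if pair_counts[key] == 0:
        return 0.0
    return boundary_counts[key] / pair_counts[key]


def split_first_token(model, text: str, threshold: float = 0.5):
    """Return (token, remainder) at the first split whose probability meets the threshold."""
    for i in range(1, len(text)):
        if boundary_probability(model, text[i - 1], text[i]) >= threshold:
            return text[:i], text[i:]
    return text, ""


model = train([("12Main", {2}), ("7Elm", {1}), ("A1", set())])
token, remainder = split_first_token(model, "221Baker")
```

With this toy training set, every digit-to-letter transition was labeled as a boundary, so the input splits into the token `221` and the remainder `Baker`.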

4.

Invention application
MACHINE-LEARNT FIELD-SPECIFIC STANDARDIZATION (In force)

Publication (announcement) number: US20210034638A1

Publication (announcement) date: 2021-02-04

Application number: US16528175

Application date: 2019-07-31

Applicant: salesforce.com, inc.

Inventor: Arun Kumar Jagota, Stanislav Georgiev

IPC: G06F16/25, G06N20/00, G06N7/00, G06F16/2455

Abstract: A system tokenizes raw values and corresponding standardized values into raw token sequences and corresponding standardized token sequences. A machine-learning model learns standardization from token insertions and token substitutions that modify the raw token sequences to match the corresponding standardized token sequences. The system tokenizes an input value into an input token sequence. The machine-learning model determines a probability of inserting an insertion token after an insertion markable token in the input token sequence. If the probability of inserting the insertion token satisfies a threshold, the system inserts the insertion token after the insertion markable token in the input token sequence. The machine-learning model determines a probability of substituting a substitution token for a substitutable token in the input token sequence. If the probability of substituting the substitution token satisfies another threshold, the system substitutes the substitution token for the substitutable token in the input token sequence.
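The substitution half of this abstract can be illustrated with a small Python sketch that learns, from raw and standardized pairs, how often each token is replaced, and then applies a replacement only when its estimated probability meets a threshold. The whitespace tokenizer, the position-by-position alignment, the names `learn_substitutions` and `standardize`, and the 0.5 default threshold are all assumptions for illustration; token insertions, which the abstract also covers, are deliberately left out.

```python
from collections import Counter, defaultdict


def learn_substitutions(pairs):
    """Count, per raw token, how often it appears and what it is replaced with."""
    seen = Counter()
    subs = defaultdict(Counter)
    for raw, standard in pairs:
        raw_tokens = raw.lower().split()
        std_tokens = standard.lower().split()
        if len(raw_tokens) != len(std_tokens):
            continue  # pairs needing insertions are out of scope for this sketch
        for r, s in zip(raw_tokens, std_tokens):
            seen[r] += 1
            if r != s:
                subs[r][s] += 1
    return seen, subs


def standardize(model, value: str, threshold: float = 0.5) -> str:
    """Substitute a token when its most frequent replacement is probable enough."""
    seen, subs = model
    out = []
    for token in value.lower().split():
        candidates = subs.get(token)
        if candidates:
            best, count = candidates.most_common(1)[0]
            if count / seen[token] >= threshold:
                token = best
        out.append(token)
    return " ".join(out)


model = learn_substitutions([
    ("12 main st", "12 main street"),
    ("9 elm st", "9 elm street"),
    ("st louis", "st louis"),
])
result = standardize(model, "40 oak st")
```

Here `st` was replaced by `street` in two of its three training occurrences, so the substitution is applied at the default threshold but would be skipped at a stricter one such as 0.9.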

5.

Granted invention patent
Recommending data providers' datasets based on database value densities (In force)

Publication (announcement) number: US10817479B2

Publication (announcement) date: 2020-10-27

Application number: US15631306

Application date: 2017-06-23

Applicant: salesforce.com, inc.

Inventor: Arun Kumar Jagota, Marc Joseph Delurgio, Venkata Murali Tejomurtula

IPC: G06F16/185, G06F16/21, G06F16/31, G06F16/2458, G06F17/18

Abstract: Recommending data providers' datasets based on database value densities is described. A database system determines a provider dataset density for a value by identifying a frequency of the value in a dataset that is provided by a data provider. The database system determines a user database density for the value by identifying a frequency of the value in a database used by a data user. The database system determines a relative density based on a relationship between the provider dataset density and the user database density. The database system determines an evaluation metric for the value, based on a combination of the relative density and the user database density. The database system causes a recommendation to be outputted, based on a relationship of the evaluation metric relative to other evaluation metrics for other values, which recommends that the data user acquire at least a part of the dataset.
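The density calculations described here can be made concrete with a short Python sketch. The abstract does not state how the relative density or the evaluation metric is computed, so two formulas are assumed purely for illustration: relative density is the ratio of provider density to user density, and the metric is the user density scaled by that ratio capped at 1. The function names are hypothetical.

```python
from collections import Counter


def densities(values):
    """Frequency of each value as a fraction of all values."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def evaluation_metrics(provider_values, user_values):
    """Score each value in the user's database (formulas are assumptions)."""
    provider = densities(provider_values)
    user = densities(user_values)
    metrics = {}
    for value, user_density in user.items():
        relative = provider.get(value, 0.0) / user_density  # assumed: ratio
        metrics[value] = min(1.0, relative) * user_density  # assumed: capped product
    return metrics


metrics = evaluation_metrics(
    provider_values=["tech", "tech", "tech", "retail"],
    user_values=["tech", "tech", "retail", "retail"],
)
best_value = max(metrics, key=metrics.get)
```

Under these assumed formulas a value ranks highly when it is both common in the user's database and at least as well represented in the provider's dataset, which is one plausible reading of ranking evaluation metrics against each other to decide what to recommend.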

6.

Granted invention patent
Match index creation (In force)

Publication (announcement) number: US10817465B2

Publication (announcement) date: 2020-10-27

Application number: US15496905

Application date: 2017-04-25

Applicant: salesforce.com, inc.

Inventor: Arun Kumar Jagota, Dmytro Kudriavtsev

IPC: G06F7/00, G06F17/00, G06F16/11, G06F7/02, G06F16/16, G06F16/24, G06F16/23, G06F16/215, G06Q10/10, G06Q30/02

Abstract: A system identifies a first number of distinct values stored in a first field by a dataset of records. The system identifies a second number of distinct values stored in a second field by the dataset of records. The system creates a trie from values stored in a field by multiple records, the field corresponding to the first field or the second field, based on comparing the first number to the second number. The system associates a node in the trie with one of the multiple records, based on a value stored in the field by the record. The system identifies a branch sequence in the trie as a key for a prospective record, based on a prospective value stored in a corresponding field by the prospective record. The system uses the key for the prospective record to identify one of the multiple records that matches the prospective record.
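A minimal Python sketch of the trie-based index follows. It assumes that the field with more distinct values is the one chosen (the abstract only says the choice is based on comparing the two counts), that the trie branches on lowercased characters, and that the full character path serves as the key. `pick_field`, `TrieNode`, `build_trie`, and `lookup` are hypothetical names.

```python
def pick_field(records, first_field, second_field):
    """Choose the field with more distinct values (assumed selection rule)."""
    first_count = len({r[first_field] for r in records})
    second_count = len({r[second_field] for r in records})
    return first_field if first_count >= second_count else second_field


class TrieNode:
    def __init__(self):
        self.children = {}  # character -> TrieNode
        self.records = []   # records whose value ends at this node


def build_trie(records, field):
    """Insert each record under the character path of its field value."""
    root = TrieNode()
    for record in records:
        node = root
        for ch in record[field].lower():
            node = node.children.setdefault(ch, TrieNode())
        node.records.append(record)
    return root


def lookup(root, value):
    """Follow the branch sequence for a prospective value; return matching records."""
    node = root
    for ch in value.lower():
        node = node.children.get(ch)
        if node is None:
            return []
    return node.records


records = [
    {"name": "Ann Lee", "city": "Austin"},
    {"name": "Bob Ray", "city": "Austin"},
    {"name": "Cy Dunn", "city": "Boston"},
]
field = pick_field(records, "name", "city")
trie = build_trie(records, field)
matches = lookup(trie, "bob ray")
```

In this toy dataset `name` has three distinct values and `city` has two, so the sketch indexes on `name`, and the prospective value `bob ray` follows a branch sequence that ends at the node holding the second record.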

7.

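Granted invention patent
Rule set induction (In force)

Publication (announcement) number: US10552744B2

Publication (announcement) date: 2020-02-04

Application number: US15368173

Application date: 2016-12-02

Applicant: salesforce.com, inc.

Inventor: Arun Kumar Jagota, Cem Gurkok

IPC: G06N5/00, G06N5/02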

Abstract: System receives inputs, each input associated with a label and having features, creates a rule for each feature, each rule including a feature and a label, each rule stored in a hierarchy, and distributes each rule into a partition associated with a label or another partition associated with another label. System identifies a number of inputs that include a feature for a rule in the rule partition, and identifies another number of inputs that include both the feature for the rule and another feature for another rule in the rule partition. System deletes the rule from the hierarchy if the ratio of the other number of inputs to the number of inputs satisfies a threshold and an additional number of inputs that includes the other antecedent feature is at least as much as the number. System predicts a label for an input including features by applying each remaining rule to the input.

8.

Invention application
SEARCH QUERY RESULT SET COUNT ESTIMATION (Under examination, published)

Publication (announcement) number: US20190236475A1

Publication (announcement) date: 2019-08-01

Application number: US15882800

Application date: 2018-01-29

Applicant: salesforce.com, inc.

Inventor: Arun Kumar Jagota, Kevin Han

IPC: G06N7/00, G06F17/30, G06N99/00

CPC classification number: G06N7/005, G06F16/24537, G06F16/9024, G06F16/951, G06N20/00

Abstract: Search query result set count estimation is described. A system parses data set query that includes first query attribute and second query attribute. The system identifies first hierarchy of connected nodes including a first node representing a first query attribute, and a second hierarchy of other connected nodes including a second node representing a second query attribute. The system identifies a directed arc connecting first correlated node in first hierarchy to second correlated node in second hierarchy. The system identifies cross-hierarchy probabilities of correlations between values of a first attribute represented by the first correlated node and values of a second attribute represented by the second correlated node. The system outputs query result set estimated count generated from cross-hierarchy probabilities, probabilities that values of first attribute are associated with values corresponding to first node, and probabilities that values of second attribute are associated with values corresponding to second node.

9.

Granted invention patent
System and method for mapping source columns to target columns (In force)

Publication (announcement) number: US08972336B2

Publication (announcement) date: 2015-03-03

Application number: US13773286

Application date: 2013-02-21

Applicant: salesforce.com, inc.

Inventor: Arun Kumar Jagota

IPC: G06F17/30, G06F17/27, G06Q30/02, G06N99/00

CPC classification number: G06F17/3007, G06F17/2715, G06F17/30569, G06F17/30985, G06N99/005, G06Q30/02

Abstract: A system and method for mapping columns from a source file to a target file. The header for each source column is evaluated heuristically to see if the header matches a predefined entity. The contents of a group of cells in the source column are evaluated probabilistically to determine a probability that the cell contents correspond to at least one of the predefined entities. A score is assigned to the likelihood that the column corresponds to one or more predefined entities. If the score meets a threshold, then the correspondence between the source column and one or more predefined entities is mapped. If the score fails to meet the threshold, then the correspondence between the source column and one or more undefined entities is mapped. Finally, each source column is transformed into a target column in accord with the map.
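The header check, cell-content scoring, and threshold decision in this abstract can be sketched in Python as below. The entity table, the regular expressions, the scoring rule (1.0 on a header match, otherwise the fraction of cells matching the pattern), and the 0.6 threshold are all assumptions chosen for illustration and are not taken from the patent.

```python
import re

# Hypothetical predefined entities: known header aliases plus a content pattern.
ENTITIES = {
    "email": {
        "headers": {"email", "e-mail"},
        "pattern": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    },
    "phone": {
        "headers": {"phone", "tel"},
        "pattern": re.compile(r"^\+?[\d\s().-]{7,}$"),
    },
}


def map_column(header, cells, threshold=0.6):
    """Return the best-scoring predefined entity, or "undefined" below the threshold."""
    best_entity, best_score = "undefined", 0.0
    for name, spec in ENTITIES.items():
        if header.strip().lower() in spec["headers"]:
            score = 1.0  # heuristic header match
        else:
            hits = sum(1 for cell in cells if spec["pattern"].match(cell.strip()))
            score = hits / len(cells) if cells else 0.0  # share of matching cells
        if score > best_score:
            best_entity, best_score = name, score
    return best_entity if best_score >= threshold else "undefined"


by_content = map_column("Contact", ["a@x.com", "b@y.org", "n/a"])
by_header = map_column("Tel", ["call later"])
unmapped = map_column("Notes", ["hello", "world"])
```

The first column has an unrecognized header but two of its three cells look like email addresses, so it maps to `email`; the second maps to `phone` on the header alone; the third falls below the threshold and is treated as an undefined entity.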

10.

Granted invention patent
Determining rationale for a prediction of a machine learning based model (In force)

Publication (announcement) number: US11790278B2

Publication (announcement) date: 2023-10-17

Application number: US16778925

Application date: 2020-01-31

Applicant: salesforce.com, inc.

Inventor: Rakesh Ganapathi Karanth, Arun Kumar Jagota, Kaushal Bansal, Amrita Dasgupta

IPC: G06N20/20, G06N5/045, G06N20/00, G06F18/243, G06F18/2134

CPC classification number: G06N20/20, G06F18/2134, G06F18/24323, G06N5/045, G06N20/00

Abstract: An online system performs predictions for real-time tasks and near real-time tasks that need to be performed by a deadline. A client device receives a real-time machine learning based model associated with a measure of accuracy. If the client device determines that a task can be performed using predictions having less than the specified measure of accuracy, the client device uses the real-time machine learning based model. If the client device determines that a higher level of accuracy of results is required, the client device sends a request to an online system. The online system provides a prediction along with a string representing a rationale for the prediction.
