Adaptive recognition of entities
    11.
    Granted patent

    Publication No.: US11755680B2

    Publication date: 2023-09-12

    Application No.: US16749234

    Application date: 2020-01-22

    Abstract: A system receives a record which includes a string and separates the string into a number of tokens, including a token and another token. The system identifies a pattern that includes an entity, another entity, and a number of entities that equals the number of tokens, and another pattern that includes the same number of entities as the number of tokens. The system determines a combined probability that combines a probability based on the number of entries in the entity's dictionary which stores the token, and another probability based on a number of character types in the other entity that match characters in the other token. If the combined probability associated with the pattern is greater than another combined probability associated with the other pattern, the system matches the record to a system record based on recognizing the token as the entity and the other token as the other entity.

    MULTI-SCALE UNSUPERVISED ANOMALY TRANSFORM FOR TIME SERIES DATA

    Publication No.: US20220121983A1

    Publication date: 2022-04-21

    Application No.: US17074928

    Application date: 2020-10-20

    Abstract: System receives input value in time series and determines first difference between input value at input time, and first value in time series at input time minus first lag. System determines first score based on first difference and both first average and first dispersion for first lag and time series values. System determines second difference between input value at input time, and second value in time series at input time minus second lag. System determines second score based on second difference and both second average and second dispersion for second lag and time series values. System transforms first and second scores into normalized anomaly score in normalized anomaly score time series. Time series database system stores normalized anomaly score time series and input value's time series into time series database. If normalized anomaly score satisfies threshold, system outputs alert including normalized anomaly score and input value retrieved from time series database.
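
The per-lag scoring described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: it assumes each per-lag score is a z-score of the lagged difference against earlier lagged differences (mean as the average, population standard deviation as the dispersion), and that the scores are combined by taking the largest and squashing it into the range 0 to 1. The function names `lag_score` and `normalized_anomaly_score` and the default lags are hypothetical.

```python
import math
from statistics import mean, pstdev

def lag_score(series, t, lag):
    """Score the lagged difference at time t against earlier lagged differences.

    Assumption: the score is an absolute z-score; the abstract only says it is
    based on a difference, an average, and a dispersion.
    """
    if t <= lag:
        raise ValueError("need at least one earlier lagged difference")
    history = [series[i] - series[i - lag] for i in range(lag, t)]
    current = series[t] - series[t - lag]
    average, dispersion = mean(history), pstdev(history)
    if dispersion == 0:
        # No historical variation: any deviation at all is maximally surprising.
        return 0.0 if current == average else math.inf
    return abs(current - average) / dispersion

def normalized_anomaly_score(series, t, lags=(1, 2)):
    """Combine the per-lag scores into one value in [0, 1] (hypothetical transform)."""
    worst = max(lag_score(series, t, lag) for lag in lags)
    return 1.0 if math.isinf(worst) else worst / (1.0 + worst)
```

An alert would then be raised when `normalized_anomaly_score(...)` meets a chosen threshold, for example 0.9; the threshold value is likewise an assumption.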

    Trie-based normalization of field values for matching

    Publication No.: US11016959B2

    Publication date: 2021-05-25

    Application No.: US15884732

    Application date: 2018-01-31

    Abstract: A system tokenizes values stored in a field by multiple records. The system creates a trie from the tokenized values, each branch in the trie labeled with one of the tokenized values, each node storing a count indicating the number of the multiple records associated with a tokenized value sequence beginning from a root of the trie. The system tokenizes a value stored in the field by a prospective record. Beginning from the root of the trie, the system identifies each node corresponding to a token value sequence for the prospective record's tokenized value. Beginning from the most recently identified node for the prospective record's token value sequence, the system identifies each extending node which stores a count that satisfies a threshold, each identified extending node corresponding to another token value sequence. The system uses the other token value sequence to identify one of the multiple records that matches the prospective record.
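
A count-bearing trie of this kind can be sketched in a few lines. The sketch below assumes whitespace tokenization with lowercasing and a simple "count is at least the threshold" test; the names `TrieNode`, `build_trie`, and `candidate_extensions` are hypothetical and not taken from the patent.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # token -> TrieNode
        self.count = 0      # records whose token sequence passes through this node

def build_trie(values):
    """Build a trie over the tokenized field values of existing records."""
    root = TrieNode()
    for value in values:
        node = root
        for token in value.lower().split():
            node = node.children.setdefault(token, TrieNode())
            node.count += 1
    return root

def candidate_extensions(root, value, threshold):
    """Return longer token sequences that extend `value` and are frequent enough."""
    tokens = value.lower().split()
    node = root
    for token in tokens:
        node = node.children.get(token)
        if node is None:
            return []  # the prospective value's prefix is not in the trie
    results = []
    stack = [(node, tokens)]
    while stack:
        current, prefix = stack.pop()
        for token, child in current.children.items():
            if child.count >= threshold:
                sequence = prefix + [token]
                results.append(" ".join(sequence))
                stack.append((child, sequence))
    return sorted(results)
```

For example, with the stored values "Acme Corp", "Acme Corp Inc", "Acme Corp Inc", and "Acme Holdings", a prospective value "Acme" and a threshold of 2 yield the candidates "acme corp" and "acme corp inc", which can then be used to look up matching records.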

    MACHINE-LEARNT FIELD-SPECIFIC TOKENIZATION

    Publication No.: US20210034596A1

    Publication date: 2021-02-04

    Application No.: US16525945

    Application date: 2019-07-30

    Abstract: A training set is created via creating adjacent classified substrings by using character classes to replace corresponding characters in adjacent substrings in each training character string, and associating each pair of adjacent classified substrings and each pair of adjacent substrings with corresponding labels indicating whether corresponding pairs include any token boundary. The system splits input character string into beginning and ending parts and creates classified beginning part by replacing beginning part character with corresponding class and classified ending part by replacing ending part character with corresponding class. The machine-learning model determines probability of token identification, based on training set to determine count of instances that classified beginning part is paired with classified ending part and count of corresponding labels that indicate inclusion of any token boundary. If token identification probability satisfies threshold, the system identifies beginning part as token and ending part as remainder of input character string.

    Optimized subset processing for de-duplication

    Publication No.: US10901996B2

    Publication date: 2021-01-26

    Application No.: US15052556

    Application date: 2016-02-24

    Abstract: Some embodiments of the present invention include a method for identifying duplicate records from a group of records in a database system. The method includes generating a cluster of records from a group of records based on one or more keys; splitting the cluster of records into multiple subsets of records with each subset of records having fewer number of records than the cluster of records, wherein the splitting the cluster of records into multiple subsets of records is based on a number of records in the cluster of records exceeding a threshold; causing duplicate sets of records in each of the subsets of records to be identified, wherein a duplicate set of records includes one or more records, and wherein when a duplicate set of records includes two or more records, the two or more records are duplicates of one another; merging all of the duplicate sets of records identified from the multiple subsets of records forming a first group of duplicate sets of records; and forming a representative set of records based on selecting a representative record from each of the duplicate sets in the first group of duplicate sets of records.
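
The split, de-duplicate, and merge steps of this abstract can be sketched as below. The sketch assumes fixed-size chunking as the splitting strategy, a caller-supplied `is_duplicate` predicate, and the first record of each duplicate set as its representative; none of these choices is stated in the abstract, and the function names are hypothetical.

```python
def find_duplicate_sets(records, is_duplicate):
    """Group records into duplicate sets by comparing against each set's first record."""
    duplicate_sets = []
    for record in records:
        for group in duplicate_sets:
            if is_duplicate(group[0], record):
                group.append(record)
                break
        else:
            duplicate_sets.append([record])
    return duplicate_sets

def dedupe_cluster(cluster, is_duplicate, threshold):
    """Split an oversized cluster, de-duplicate each subset, and pick representatives."""
    if len(cluster) > threshold:
        # Each subset has fewer records than the cluster, so pairwise work stays small.
        subsets = [cluster[i:i + threshold] for i in range(0, len(cluster), threshold)]
    else:
        subsets = [cluster]
    first_group = []
    for subset in subsets:
        first_group.extend(find_duplicate_sets(subset, is_duplicate))
    representatives = [duplicate_set[0] for duplicate_set in first_group]
    return first_group, representatives
```

Because duplicates can land in different subsets, the representative set is typically much smaller than the original cluster but may still contain duplicates of its own, so a further comparison pass over the representatives is a natural follow-up.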

    Evaluating personalized recommendation models

    Publication No.: US10572820B2

    Publication date: 2020-02-25

    Application No.: US14843078

    Application date: 2015-09-02

    Abstract: A personalized recommendation model scores each object in an interaction set of objects with which a user interacted and in a random set of objects with which the user lacks known interaction. A system sorts each scored object based on a decreasing order of each corresponding score, and identifies a high scoring set of the sorted objects with a number (equal to the number of objects in the interaction set of objects) of highest corresponding scores. The system aggregates a corresponding order value for each object in the high scoring set that is also in the interaction set of objects (the corresponding order value for an object is based on a corresponding order for the object in the high scoring set). The system evaluates the model for the user by dividing the aggregated order value by an aggregation of a corresponding order value for each object in the high scoring set.
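
The evaluation ratio in this abstract can be illustrated with a short sketch. It assumes the order value of an object is the reciprocal of its 1-based position in the high scoring set and that aggregation is a plain sum; the abstract fixes neither choice, and `evaluate_for_user` is a hypothetical name.

```python
def evaluate_for_user(score, interaction_set, random_set):
    """Rank-weighted share of the top-k objects that the user actually interacted with.

    `score` is the model under evaluation: a callable mapping an object to a number.
    """
    interacted = set(interaction_set)
    k = len(interacted)
    candidates = list(interacted) + [obj for obj in random_set if obj not in interacted]
    ranked = sorted(candidates, key=score, reverse=True)
    high_scoring = ranked[:k]

    def order_value(position):
        # Assumption: earlier positions count for more (reciprocal rank).
        return 1.0 / position

    hits = sum(order_value(pos) for pos, obj in enumerate(high_scoring, start=1)
               if obj in interacted)
    total = sum(order_value(pos) for pos in range(1, len(high_scoring) + 1))
    return hits / total if total else 0.0
```

A model that ranks every interacted object above every random object scores 1.0 under this sketch, and one that ranks none of them in the top k scores 0.0.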

    System and method for using a statistical classifier to score contact entities

    Publication No.: US09646246B2

    Publication date: 2017-05-09

    Application No.: US13773141

    Application date: 2013-02-21

    CPC classification number: G06N5/02 G06F17/30985 G06N7/005 G06Q30/02

    Abstract: A system and method for associating a character string with one or more defined entities of a contact record. An input character string is received. The string is first evaluated to see if the structure of the string is recognized. If not, then the string is compared to entries in a look up table. If the string format is not recognized, and the string is not found in the look up table, then a posterior probability is calculated for a set of defined entities over a limited set of string processing features. The result of probabilistic scoring determines which of the defined entities to associate with the character string.

    COMBINED DIRECTED GRAPHS
    18.
    Patent application

    Publication No.: US20170091229A1

    Publication date: 2017-03-30

    Application No.: US14867154

    Application date: 2015-09-28

    Abstract: A combined directed graph is created having a corresponding node for each node in a first directed graph lacking a corresponding node in a second directed graph, each node in the second graph lacking a corresponding node in the first graph, and each node in the first graph having a corresponding node in the second graph. A corresponding directed arc is created in the combined directed graph for each arc in the first graph lacking a corresponding arc in the second directed graph, each arc in the second graph lacking a corresponding arc in the first graph, and each arc in the first graph having a corresponding arc in the second graph. A recommendation is output for a user to interact with a recommended object based on an object interaction and a conditional probability, in the combined graph, which corresponds to the recommended object and the object interaction.
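
In effect the combined graph holds the union of the nodes and arcs of the two input graphs. The sketch below assumes each graph is a dictionary mapping a node to its outgoing arcs with interaction counts, that counts on shared arcs are summed, and that the conditional probability of a target given a source is its share of the source's outgoing counts. These representation choices and the function names are hypothetical.

```python
def combine_graphs(first, second):
    """Union of nodes and arcs; counts on arcs present in both graphs are summed."""
    combined = {}
    for graph in (first, second):
        for source, arcs in graph.items():
            out = combined.setdefault(source, {})
            for target, count in arcs.items():
                combined.setdefault(target, {})  # make sure the target node exists
                out[target] = out.get(target, 0) + count
    return combined

def recommend_from_graph(graph, interacted_object):
    """Return (recommended object, conditional probability), or None if no arcs exist."""
    arcs = graph.get(interacted_object, {})
    total = sum(arcs.values())
    if total == 0:
        return None
    best = max(arcs, key=arcs.get)
    return best, arcs[best] / total
```

For instance, if one graph records "laptop to mouse" three times and the other records it once, the combined arc carries a count of four, and the recommendation for a user who interacted with "laptop" is drawn from the combined counts.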

    EVALUATING PERSONALIZED RECOMMENDATION MODELS
    19.
    Patent application
    Status: under examination (published)

    Publication No.: US20170061325A1

    Publication date: 2017-03-02

    Application No.: US14843078

    Application date: 2015-09-02

    CPC classification number: G06N20/00 G06F16/337 G06F16/9535

    Abstract: A personalized recommendation model scores each object in an interaction set of objects with which a user interacted and in a random set of objects with which the user lacks known interaction. A system sorts each scored object based on a decreasing order of each corresponding score, and identifies a high scoring set of the sorted objects with a number (equal to the number of objects in the interaction set of objects) of highest corresponding scores. The system aggregates a corresponding order value for each object in the high scoring set that is also in the interaction set of objects (the corresponding order value for an object is based on a corresponding order for the object in the high scoring set). The system evaluates the model for the user by dividing the aggregated order value by an aggregation of a corresponding order value for each object in the high scoring set.

    ACCOUNT RECOMMENDATIONS FOR USER ACCOUNT SETS
    20.
    Patent application
    Status: under examination (published)

    Publication No.: US20160379265A1

    Publication date: 2016-12-29

    Application No.: US14750551

    Application date: 2015-06-25

    CPC classification number: G06Q30/0269

    Abstract: New account recommendations for user account sets are described. A system creates an accounts profile for a set of accounts based on multiple attributes associated with each account of the set of accounts. The system calculates an account score for an account based on comparing multiple attributes associated with the account against the accounts profile, wherein the account is not in the set of accounts. The system determines whether the account score satisfies an account score threshold. The system recommends the account to a user associated with the set of accounts if the account score satisfies the account score threshold.
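
The profile, score, and threshold steps can be sketched as follows. The sketch assumes the accounts profile is, per attribute, the fraction of the user's accounts holding each value, and that an account's score is the average of those fractions for its own attribute values. The abstract does not define either computation, so these definitions and the function names are hypothetical.

```python
from collections import Counter

def build_profile(accounts):
    """Per attribute, the fraction of accounts in the set that hold each value."""
    counts = {}
    for attributes in accounts.values():
        for attribute, value in attributes.items():
            counts.setdefault(attribute, Counter())[value] += 1
    n = len(accounts)
    return {attribute: {value: c / n for value, c in counter.items()}
            for attribute, counter in counts.items()}

def account_score(attributes, profile):
    """Average, over profiled attributes, of how common the account's value is."""
    if not profile:
        return 0.0
    return sum(profile[a].get(attributes.get(a), 0.0) for a in profile) / len(profile)

def recommend_accounts(candidates, user_accounts, threshold):
    """Recommend candidate accounts outside the user's set that meet the threshold."""
    profile = build_profile(user_accounts)
    return [name for name, attributes in candidates.items()
            if name not in user_accounts
            and account_score(attributes, profile) >= threshold]
```

With a user set of two software accounts, one in the US and one in the EU, a new US software account scores (1.0 + 0.5) / 2 = 0.75 and is recommended at a threshold of 0.7, while a retail account in another region scores 0.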
