Organizational data enrichment
    Granted invention patent

    Publication number: US10242258B2

    Publication date: 2019-03-26

    Application number: US14929104

    Filing date: 2015-10-30

    Abstract: In an example embodiment, a fuzzy join operation is performed on a plurality of records. For each pair of records, a first plurality of features is evaluated for both records by calculating the term frequency-inverse document frequency (TF-IDF) for each token of each field relevant to each feature. Based on those TF-IDF values, a similarity score is computed using a similarity function that adds a weight assigned to the TF-IDF for any token that appears in both records. A graph data structure is then created with a node for each record in the plurality of records and edges between the nodes, except that no edge is created between the nodes of any record pair whose similarity score does not transgress a first threshold.
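The pipeline in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the field names, the `FEATURE_WEIGHTS` values, the smoothed IDF formula, and the choice to score a shared token by the product of its two TF-IDF values are all assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical features: field name -> weight (assumed values, not from the patent).
FEATURE_WEIGHTS = {"name": 1.0, "city": 0.5}


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def tfidf_by_record(records, field):
    """Return one {token: tf-idf} dict per record for the given field."""
    docs = [tokenize(r.get(field, "")) for r in records]
    n = len(docs)
    # Document frequency: number of records whose field contains the token.
    df = Counter(tok for doc in docs for tok in set(doc))
    result = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed IDF variant: log(n / df) + 1, so ubiquitous tokens keep some weight.
        result.append({
            tok: (cnt / len(doc)) * (math.log(n / df[tok]) + 1.0)
            for tok, cnt in tf.items()
        })
    return result


def fuzzy_join_graph(records, threshold):
    """Build an adjacency-set graph with an edge only where the score exceeds threshold."""
    tfidf = {f: tfidf_by_record(records, f) for f in FEATURE_WEIGHTS}
    graph = {i: set() for i in range(len(records))}
    for i, j in combinations(range(len(records)), 2):
        score = 0.0
        for field, weight in FEATURE_WEIGHTS.items():
            a, b = tfidf[field][i], tfidf[field][j]
            # Only tokens present in both records contribute to the score.
            score += weight * sum(a[t] * b[t] for t in a.keys() & b.keys())
        if score > threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


records = [
    {"name": "Acme Corp", "city": "San Jose"},
    {"name": "Acme Corporation", "city": "San Jose"},
    {"name": "Globex Inc", "city": "Berlin"},
]
graph = fuzzy_join_graph(records, threshold=0.5)
```

With these sample records, only the two Acme entries share tokens, so they are the only pair connected by an edge. Connected components of such a graph could then be treated as candidate groups of matching records.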

    Matching entities across multiple data sources

    Publication number: US10372720B2

    Publication date: 2019-08-06

    Application number: US15339703

    Filing date: 2016-10-31

    Abstract: Techniques for performing a fuzzy match of data from multiple sources are provided. In one technique, an email address of a sender of an email message is extracted from the email message. The email address is used to retrieve, from a first data source, first entity data about one or more entities, such as users. The first entity data is used to retrieve, from a second data source, second entity data about one or more entities. First data that pertains to the sender and that originates from the first data source is combined with second data that pertains to the sender and that originates from the second data source to generate sender data. The sender data is then presented via an email client that displays the email message.
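The flow described in this abstract might look something like the sketch below. The two data sources (`CRM` and `DIRECTORY`), their fields, and the use of a fuzzy name comparison for the second lookup are hypothetical stand-ins; the abstract does not specify how the second source is queried.

```python
import difflib
from email import message_from_string
from email.utils import parseaddr

# Hypothetical first data source, keyed by email address.
CRM = {"jane@example.com": {"name": "Jane Doe", "company": "Acme"}}

# Hypothetical second data source, which has no email addresses.
DIRECTORY = [
    {"name": "Jane A. Doe", "title": "VP Engineering"},
    {"name": "John Smith", "title": "Analyst"},
]


def sender_address(raw_email):
    """Extract the sender's address from the From header."""
    msg = message_from_string(raw_email)
    return parseaddr(msg.get("From", ""))[1].lower()


def lookup_second(first_entity, directory, cutoff=0.6):
    """Fuzzy-match the name from the first source against the second source."""
    by_name = {entry["name"].lower(): entry for entry in directory}
    hits = difflib.get_close_matches(
        first_entity["name"].lower(), list(by_name), n=1, cutoff=cutoff
    )
    return by_name[hits[0]] if hits else None


def sender_data(raw_email):
    """Combine data about the sender from both sources."""
    addr = sender_address(raw_email)
    first = CRM.get(addr)
    if first is None:
        return None
    combined = {"email": addr, **first}
    second = lookup_second(first, DIRECTORY)
    if second is not None:
        combined["title"] = second["title"]
    return combined


RAW = "From: Jane Doe <Jane@Example.com>\nSubject: Hello\n\nHi there.\n"
info = sender_data(RAW)
```

An email client would then render `info` alongside the message. The `cutoff` value is an arbitrary choice for the example and would need tuning against real data.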