-
Publication No.: US10769426B2
Publication date: 2020-09-08
Application No.: US14929128
Filing date: 2015-10-30
Inventors: Songtao Guo, Ke Wang, Alex Ching Lai, Aarti Kumar, Keith Wai Kit Tsang, Rekha Thakur, Song Lin, Christopher Matthew Degiere
IPC classification: G06Q10/06, G06Q30/02, G06K9/00, G06K9/62, G06Q10/10, G06N20/00, G06F16/58, G06F16/28, G06F16/951, G06F16/583, G06F16/901, G06F16/2453, G06K9/46, G06F16/215, G06N5/00, G06N20/20, G06N7/00, G06N7/02, G06Q50/00, H04L29/08
Abstract: In an example embodiment, a member profile corresponding to a member of a social networking service is obtained. Usage information for the member is then obtained, and one or more member metrics are calculated based on the member profile and usage information for the corresponding member. A plurality of features is extracted from the member profile and the one or more member metrics. The plurality of features is inserted into an organization name confidence score model to obtain a confidence score for an organization name in the member profile.
-
Publication No.: US10242258B2
Publication date: 2019-03-26
Application No.: US14929104
Filing date: 2015-10-30
IPC classification: G06K9/46, G06K9/00, G06K9/62, G06F17/30, G06Q10/10, G06N99/00, G06N7/02, G06Q10/06, G06Q50/00, H04L29/08
Abstract: In an example embodiment, a fuzzy join operation is performed by, for each pair of records, evaluating a first plurality of features for both records in the pair of records by calculating term frequency-inverse document frequency (TF-IDF) for each token of each field relevant to each feature and, based on the calculated TF-IDF for each token of each field relevant to each feature, computing a similarity score based on the similarity function by adding a weight assigned to the TF-IDF for any token that appears in both records. Then a graph data structure is created, having a node for each record in the plurality of records and edges between each of the nodes, except, for each record pair having a similarity score that does not transgress a first threshold, causing no edge between the nodes for the record pair to appear in the graph data structure.
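The fuzzy join described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: it assumes each record is a single text field, uses a plain TF-IDF formula, treats "transgress" as "exceed", and the names `tfidf_weights`, `similarity`, and `build_graph` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lower-case whitespace tokenizer (an assumption; the abstract does not specify one)."""
    return text.lower().split()


def tfidf_weights(records):
    """Return one {token: tf-idf} dict per record; each record is a plain string."""
    docs = [Counter(tokenize(r)) for r in records]
    n = len(docs)
    # Document frequency: in how many records does each token appear?
    df = Counter(tok for d in docs for tok in d)
    return [
        {tok: (cnt / sum(d.values())) * math.log(n / df[tok]) for tok, cnt in d.items()}
        for d in docs
    ]


def similarity(w_a, w_b):
    """Add up the TF-IDF weights of every token that appears in both records."""
    shared = w_a.keys() & w_b.keys()
    return sum(w_a[t] + w_b[t] for t in shared)


def build_graph(records, threshold):
    """One node per record; an edge only where the pair's score exceeds the threshold."""
    weights = tfidf_weights(records)
    graph = {i: set() for i in range(len(records))}
    for i, j in combinations(range(len(records)), 2):
        if similarity(weights[i], weights[j]) > threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


records = ["Acme Corp", "Acme Corporation", "Globex Inc"]
print(build_graph(records, threshold=0.1))
```

With these three sample records, only the two "Acme" entries share a token, so they are the only pair joined by an edge. Connected components of such a graph would then be candidate groups of matching records.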
-
Publication No.: US10372720B2
Publication date: 2019-08-06
Application No.: US15339703
Filing date: 2016-10-31
Inventors: Chris Degiere, Aarti Kumar, Xian Li, Alexander Power, Derek Ribbons
IPC classification: G06F17/30, G06F16/2458, G06F16/248, G06F16/2457, H04L12/58
Abstract: Techniques for performing a fuzzy match of data from multiple sources are provided. In one technique, an email address of a sender of an email message is extracted from the email message. The email address is used to retrieve, from a first data source, first entity data about one or more entities, such as users. The first entity data is used to retrieve, from a second data source, second entity data about one or more entities. First data that pertains to the sender and that originates from the first data source is combined with second data that pertains to the sender and that originates from the second data source to generate sender data. The sender data is then presented via an email client that displays the email message.
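The lookup chain in this abstract (sender address, then first source, then second source, then combined sender data) can be sketched as follows. This is a minimal illustration under stated assumptions: the two data sources are hypothetical in-memory dictionaries, and a normalized name-and-company key stands in for whatever fuzzy matching the patent actually uses.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical stand-ins for the two data sources named in the abstract.
FIRST_SOURCE = {"ana@example.com": {"name": "Ana Diaz", "company": "Example Co"}}
SECOND_SOURCE = {("ana diaz", "example co"): {"title": "Engineer", "city": "Lisbon"}}


def sender_address(raw_message):
    """Extract the sender's email address from the From header."""
    msg = message_from_string(raw_message)
    _, address = parseaddr(msg.get("From", ""))
    return address.lower()


def build_sender_data(raw_message):
    """Chain the lookups: address -> first source -> second source -> merged data."""
    address = sender_address(raw_message)
    first = FIRST_SOURCE.get(address)
    if first is None:
        return {"email": address}
    # Assumption: a case-normalized (name, company) key approximates the fuzzy match.
    key = (first["name"].lower(), first["company"].lower())
    second = SECOND_SOURCE.get(key, {})
    return {"email": address, **first, **second}


raw = "From: Ana Diaz <Ana@Example.com>\nSubject: hi\n\nHello"
print(build_sender_data(raw))
```

An email client would render the returned dictionary next to the message; that presentation step is omitted here.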
-
Publication No.: US10282606B2
Publication date: 2019-05-07
Application No.: US15937051
Filing date: 2018-03-27
Inventors: Songtao Guo, Christopher Matthew Degiere, Jingjing Huang, Aarti Kumar, Alex Ching Lai, Xian Li
IPC classification: G06K9/00, H04L29/08, G06F17/30, G06Q50/00, G06Q10/10, G06Q10/06, G06N99/00, G06N7/02, G06K9/62, G06K9/46
Abstract: In an example embodiment, a web page is obtained using a web page address stored in a first record and is parsed to extract one or more images from the web page along with a first plurality of features for each of the one or more images from the web page. Information about each image of the web page and the extracted first plurality of features for the web page are input into a supervised machine learning classifier to calculate a logo confidence score for each image of the web page, the logo confidence score indicating the probability that the image is an organization logo. In response to a particular image in the web page having a logo confidence score transgressing a first threshold, the particular image is injected into an organization logo field of the first record.
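The parse, score, and inject flow in this abstract can be sketched in Python. Everything specific here is an assumption made for illustration: the two features, the hand-set weights that stand in for a trained supervised classifier, the `logo` field name, and the choice to inject only the top-scoring image. Fetching the page is omitted; the HTML is passed in directly.

```python
import math
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect <img> tags and derive two toy features for each one."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            a = dict(attrs)
            src = a.get("src") or ""
            alt = a.get("alt") or ""
            self.images.append({
                "src": src,
                "name_has_logo": float("logo" in src.lower()),
                "alt_has_logo": float("logo" in alt.lower()),
            })


# Hypothetical weights; a real system would learn these from labeled images.
WEIGHTS = {"name_has_logo": 3.0, "alt_has_logo": 2.0}
BIAS = -2.0


def logo_confidence(image):
    """Logistic score in (0, 1), read as the probability that the image is a logo."""
    z = BIAS + sum(w * image[f] for f, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def enrich_record(record, html, threshold=0.8):
    """Return a copy of the record with a logo field if the best image clears the threshold."""
    collector = ImageCollector()
    collector.feed(html)
    scored = [(logo_confidence(img), img["src"]) for img in collector.images]
    if scored:
        best_score, best_src = max(scored)
        if best_score > threshold:
            return {**record, "logo": best_src}
    return record


page = '<html><img src="/static/logo.png" alt="Acme logo"><img src="/banner.jpg" alt="Sale"></html>'
print(enrich_record({"url": "https://acme.example"}, page))
```

Returning a new dictionary rather than mutating the input keeps the original record available if the enrichment later needs to be reviewed or reverted.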
-
Publication No.: US20180218207A1
Publication date: 2018-08-02
Application No.: US15937051
Filing date: 2018-03-27
Inventors: Songtao Guo, Christopher Matthew Degiere, Jingjing Huang, Aarti Kumar, Alex Ching Lai, Xian Li
IPC classification: G06K9/00, H04L29/08, G06F17/30, G06Q50/00, G06Q10/10, G06Q10/06, G06N99/00, G06N7/02, G06K9/62
CPC classification: G06K9/00456, G06F17/30256, G06F17/30259, G06F17/30265, G06F17/30448, G06F17/30466, G06F17/30598, G06F17/30864, G06F17/30958, G06K9/00469, G06K9/46, G06K9/6215, G06K9/6256, G06K9/6263, G06K9/6276, G06K2209/25, G06N7/02, G06N99/005, G06Q10/06393, G06Q10/10, G06Q50/01, H04L67/10, H04L67/306
Abstract: In an example embodiment, a web page is obtained using a web page address stored in a first record and is parsed to extract one or more images from the web page along with a first plurality of features for each of the one or more images from the web page. Information about each image of the web page and the extracted first plurality of features for the web page are input into a supervised machine learning classifier to calculate a logo confidence score for each image of the web page, the logo confidence score indicating the probability that the image is an organization logo. In response to a particular image in the web page having a logo confidence score transgressing a first threshold, the particular image is injected into an organization logo field of the first record.
-
Publication No.: US10002292B2
Publication date: 2018-06-19
Application No.: US14929116
Filing date: 2015-10-30
Inventors: Songtao Guo, Christopher Matthew Degiere, Jingjing Huang, Aarti Kumar, Alex Ching Lai, Xian Li
IPC classification: G06K9/00, G06N99/00, G06F17/30, G06Q10/10, G06F17/27, G06K9/62, G06N7/02, G06Q10/06, G06Q50/00, H04L29/08
CPC classification: G06K9/00456, G06F16/215, G06F16/24534, G06F16/24544, G06F16/285, G06F16/58, G06F16/5838, G06F16/5854, G06F16/9024, G06F16/951, G06K9/00469, G06K9/46, G06K9/6215, G06K9/6256, G06K9/6263, G06K9/6276, G06K2209/25, G06N5/003, G06N7/005, G06N7/02, G06N20/00, G06N20/20, G06Q10/06393, G06Q10/10, G06Q50/01, H04L67/10, H04L67/306, Y04S10/54
Abstract: In an example embodiment, a web page is obtained using a web page address stored in a first record and is parsed to extract one or more images from the web page along with a first plurality of features for each of the one or more images from the web page. Information about each image of the web page and the extracted first plurality of features for the web page are input into a supervised machine learning classifier to calculate a logo confidence score for each image of the web page, the logo confidence score indicating the probability that the image is an organization logo. In response to a particular image in the web page having a logo confidence score transgressing a first threshold, the particular image is injected into an organization logo field of the first record.