-
Publication No.: US11610109B2
Publication date: 2023-03-21
Application No.: US16142441
Application date: 2018-09-26
IPC classification: G06N3/08, G06Q10/06, G06F16/28, G06N3/04, G06Q10/0631
Abstract: In an example embodiment, a system is provided whereby a machine learning model is trained to predict a standardization for a given raw title. A neural network may be trained whose input is a raw title (such as a query string) and a list of candidate titles (either title identifications in a taxonomy, or English strings), which produces a probability that the raw title and each candidate belong to the same title. The model is able to standardize titles in any language included in the training data without first having to perform language identification or normalization of the title. Additionally, the model is able to benefit from the existence of “loan words” (words adopted from a foreign language with little or no modification) and relations between languages.
-
Publication No.: US11436532B2
Publication date: 2022-09-06
Application No.: US16703386
Application date: 2019-12-04
Inventors: Tianhao Lu, Junzhe Miao, Yunpeng Xu, Dan Shacham, Hong H. Tam, Tao Xiong
IPC classification: G06N20/00, G06F16/174, G06F16/23, G06F16/953
Abstract: The disclosed embodiments provide a system that identifies duplicate entities. During operation, the system selects training data for a first machine learning model based on confidence scores representing likelihoods that pairs of entities in an online system are duplicates. Next, the system updates parameters of the first machine learning model based on features and labels in the training data. The system then identifies a first subset of additional pairs of the entities as duplicate entities based on scores generated by the first machine learning model from values of the features for the additional pairs and a first threshold associated with the scores. The system also determines a canonical entity in each of the duplicate entities based on additional features. Finally, the system updates content outputted in a user interface of the online system based on the identified first subset of the additional pairs.
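The thresholding and canonical-selection steps in this abstract can be illustrated with a minimal Python sketch. It assumes the model scores are already computed; the function names, the 0.9 threshold, and the `completeness` feature are hypothetical, since the abstract does not say which additional features decide the canonical entity.

```python
def find_duplicates(scored_pairs, threshold=0.9):
    """Keep the pairs whose model score meets the threshold.

    scored_pairs: iterable of (entity_a, entity_b, score) tuples.
    The threshold value is an illustrative assumption.
    """
    return [(a, b) for a, b, score in scored_pairs if score >= threshold]


def canonical(pair, completeness):
    """Pick the canonical entity of a duplicate pair.

    `completeness` is a hypothetical stand-in for the "additional
    features" mentioned in the abstract: higher means preferred.
    """
    return max(pair, key=lambda entity: completeness[entity])


scored = [("A", "B", 0.95), ("C", "D", 0.40)]
duplicates = find_duplicates(scored)
winners = [canonical(pair, {"A": 3, "B": 7}) for pair in duplicates]
```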
-
Publication No.: US20210173825A1
Publication date: 2021-06-10
Application No.: US16703386
Application date: 2019-12-04
Inventors: Tianhao Lu, Junzhe Miao, Yunpeng Xu, Dan Shacham, Hong H. Tam, Tao Xiong
Abstract: The disclosed embodiments provide a system that identifies duplicate entities. During operation, the system selects training data for a first machine learning model based on confidence scores representing likelihoods that pairs of entities in an online system are duplicates. Next, the system updates parameters of the first machine learning model based on features and labels in the training data. The system then identifies a first subset of additional pairs of the entities as duplicate entities based on scores generated by the first machine learning model from values of the features for the additional pairs and a first threshold associated with the scores. The system also determines a canonical entity in each of the duplicate entities based on additional features. Finally, the system updates content outputted in a user interface of the online system based on the identified first subset of the additional pairs.
-
Publication No.: US10339612B2
Publication date: 2019-07-02
Application No.: US15195562
Application date: 2016-06-28
Inventors: Uri Merhav, Peide Zhong, Angela Jiang, Qi He, Dan Shacham
Abstract: An online social networking system extracts terms from an unstructured job title record. The system searches a job role taxonomy database with the extracted terms to identify job roles. For each job role identified, the system extracts a plurality of additional terms appearing in the unstructured job title record. For each additional term, the system maps the additional term to a standardized modifier, thereby identifying a job seniority modifier, a job specialty modifier, a job accreditation modifier, and a job status modifier for each additional term. The system creates a multi-dimensional standardized job title for the member profile or job posting by writing the job role, the job seniority modifier, the job specialty modifier, the job accreditation modifier, and the job status modifier to a standardization record in a standardization database.
-
Publication No.: US11188823B2
Publication date: 2021-11-30
Application No.: US15168750
Application date: 2016-05-31
Inventors: Uri Merhav, Dan Shacham
Abstract: In an example embodiment, a first DCNN is trained to output a value for a first metric by inputting a plurality of sample documents to the first DCNN, with each of the sample documents having been labeled with a value for the first metric. Then a plurality of possible transformations of a first input document are fed to the first DCNN, obtaining a value for the first metric for each of the plurality of possible transformations. A first transformation is selected from the plurality of possible transformations based on the values for the first metric for each of the plurality of possible transformations. Then a second DCNN is trained to output a transformation for a document by inputting the selected first transformation to the second DCNN. The second input document is fed to the second DCNN, obtaining a second transformation of the second input document.
-
Publication No.: US20200160398A1
Publication date: 2020-05-21
Application No.: US16192565
Application date: 2018-11-15
Inventors: Ruoyan Wang, Liu Yang, Dan Shacham, Gaurav Chandalia
IPC classification: G06Q30/02
Abstract: Technologies for associating an entity with a content delivery campaign are provided. Disclosed techniques include determining a first value of a profile attribute of the entity. A particular node that matches the first value is identified from a value tree of nodes. A parent node of the particular node is identified from the value tree. Child nodes of the parent node are identified, where the child nodes do not include the particular node. Values from the child nodes are then associated with the profile attribute of the entity. A particular value is received for a particular targeting criterion of the content delivery campaign. It is determined whether the particular value matches a value of the child nodes, where the particular value does not match the first value. In response to determining that the particular value matches a value of the child nodes, associating the entity with the content delivery campaign.
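The sibling-expansion step in this abstract can be illustrated with a minimal Python sketch. The tree contents, the `PARENT` mapping, and both function names are hypothetical and chosen only for illustration; the abstract does not specify how the value tree is stored.

```python
# Hypothetical value tree stored as child -> parent; not taken from the patent.
PARENT = {
    "Software Engineer": "Engineering",
    "Data Engineer": "Engineering",
    "QA Engineer": "Engineering",
    "Accountant": "Finance",
}


def sibling_values(value, parent=PARENT):
    """Return the values of the other children of `value`'s parent node."""
    p = parent.get(value)
    if p is None:
        return set()
    return {child for child, par in parent.items() if par == p and child != value}


def matches_campaign(profile_value, targeting_value, parent=PARENT):
    """True when the targeting value differs from the profile value
    but matches one of the profile value's siblings in the tree."""
    return (
        targeting_value != profile_value
        and targeting_value in sibling_values(profile_value, parent)
    )
```

Under these assumptions, an entity whose profile value is "Software Engineer" would be associated with a campaign targeting "Data Engineer", but not with one targeting "Accountant".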
-
Publication No.: US20190205376A1
Publication date: 2019-07-04
Application No.: US15885004
Application date: 2018-01-31
Inventors: Uri Merhav, Dan Shacham, Peide Zhong
IPC classification: G06F17/27
CPC classification: G06F17/277, G06F17/2715, G06F17/273, G06F17/2795
Abstract: Example methods and systems are directed to determining a standardized job title corresponding to an input job title. The input job title may be normalized according to various normalization rules to produce a normalized input job title. The normalized input job title may then be tokenized into one or more n-grams, and synonyms may be identified from the various n-grams. A title taxonomy may then be searched using the normalized input job title, the tokenized n-grams, and the identified synonyms, where the search results correspond to standardized job titles that match the various inputs. Each of the candidate job titles may then be scored using congruence type features and information quality features. The highest scoring candidate job title is then selected as the standardized job title for the input job title. An association is then established between the standardized job title and the input job title.
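The normalize, tokenize, search, and score sequence in this abstract can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the `TAXONOMY` and `SYNONYMS` tables are hypothetical, synonyms are applied per token as a simplification, and a Jaccard token overlap stands in for the congruence and information quality features, which the abstract does not define.

```python
import re

# Hypothetical taxonomy and synonym table; illustrative only.
TAXONOMY = {"software engineer", "senior software engineer", "data scientist"}
SYNONYMS = {"developer": "engineer", "sr": "senior"}


def normalize(title):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", title.lower()).split())


def ngrams(tokens, max_n=3):
    """All contiguous n-grams of the token list, up to length max_n."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def standardize(raw):
    """Return the best-scoring taxonomy title for `raw`, or None."""
    tokens = [SYNONYMS.get(t, t) for t in normalize(raw).split()]
    grams = set(ngrams(tokens))

    def score(candidate):
        # Jaccard overlap: a stand-in for the patent's scoring features.
        cand = set(candidate.split())
        return len(cand & set(tokens)) / len(cand | set(tokens))

    # Search: taxonomy entries sharing at least one n-gram with the input.
    candidates = [c for c in TAXONOMY if grams & set(ngrams(c.split()))]
    return max(candidates, key=score) if candidates else None


best = standardize("Sr. Software Developer")
```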
-
Publication No.: US10255586B2
Publication date: 2019-04-09
Application No.: US15199423
Application date: 2016-06-30
Inventors: Dan Shacham, Uri Merhav, Peide Zhong, Qi He, Angela Jiang
IPC classification: G06Q10/00, G06Q10/10, G06Q50/00, G06F16/9535, G06F16/2457
Abstract: An online social networking system receives an unstructured job title record from a profile of a member or a job posting. The system extracts a raw job title from the unstructured job title record, and extracts a first seniority level from the raw job title. The first seniority level is a seniority modifier associated with the raw job title. The system determines a second seniority level. The second seniority level is a company seniority within the company associated with the unstructured job title record. The system determines a third seniority level. The third seniority level is a seniority score for the member or the job posting. The system compares the seniority score with a second seniority score, and communicates with the member, or transmits the job posting to the member, based on the comparison of the seniority score and the second seniority score.
-
Publication No.: US20220391690A1
Publication date: 2022-12-08
Application No.: US17340607
Application date: 2021-06-07
Inventors: Shuai Wang, Piede Zhong, Ji Yan, Feng Guo, Dan Shacham, Fei Chen
Abstract: Described herein is a technique for mapping the raw text of a job title of an online job posting to an entity embedding, associated with an entity or entry of a title taxonomy. The raw text of the job title is first encoded to generate a multilingual word embedding in a multilingual word embedding space. Then, the vector representation of the job title, as represented in the multilingual word embedding space is translated, using a neural network, to a vector representation of the job title in the entity embedding space. Finally, a nearest neighbor search is performed to identify an entity embedding associated with an entity or entry in the title taxonomy that has a vector representation that is closest in distance to the vector output by the neural network.
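The final nearest neighbor step in this abstract can be illustrated with a minimal Python sketch. It assumes the neural network has already produced a vector in the entity embedding space; the `ENTITY_EMBEDDINGS` table, the three-dimensional vectors, and the choice of Euclidean distance are all hypothetical, since the abstract only says "closest in distance".

```python
import math

# Hypothetical entity embeddings; real ones would be learned, not hand-written.
ENTITY_EMBEDDINGS = {
    "Software Engineer": [0.9, 0.1, 0.0],
    "Data Scientist": [0.1, 0.9, 0.0],
    "Nurse": [0.0, 0.1, 0.9],
}


def nearest_entity(vector, entities=ENTITY_EMBEDDINGS):
    """Return the taxonomy entity whose embedding is closest to `vector`.

    Uses Euclidean distance (an assumption) and a brute-force scan,
    which is fine for a small table but not for a large taxonomy.
    """
    return min(entities, key=lambda name: math.dist(vector, entities[name]))


# Stand-in for the vector output by the neural network.
match = nearest_entity([0.8, 0.2, 0.1])
```

A production system would more likely use an approximate nearest neighbor index than a linear scan; the abstract does not say which.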
-
Publication No.: US11188992B2
Publication date: 2021-11-30
Application No.: US15366728
Application date: 2016-12-01
Inventors: Siyuan Zhang, Qin Iris Wang, Dan Shacham, Mohsen Jamali
Abstract: A system and method for inferring appropriate courses for recommendation based on member characteristics is disclosed. A social networking system receives a request for recommended courses, wherein the request is associated with a member of the social networking system. The social networking system identifies a group of members who are similar to the first member. The social networking system creates a list of recently learned skills by members of the group of members similar to the member. For a particular skill in the list of skills, the social networking system determines whether the member possesses the particular skill. In accordance with a determination that the member does not possess the particular skill, the social networking system identifies at least one course that teaches the particular skill from a list of courses. The social networking system transmits the identified course to the client device for display as a recommended course.
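The skill-gap logic in this abstract can be sketched in Python as follows. The sketch assumes the group of similar members has already been identified; the function name, argument shapes, and sample data are hypothetical and serve only to illustrate the flow.

```python
def recommend_courses(member_skills, similar_recent_skills, courses):
    """Recommend courses teaching skills the member lacks.

    member_skills: set of skills the member already has.
    similar_recent_skills: list of skill lists recently learned by
        similar members (one list per similar member).
    courses: dict mapping course name -> set of skills it teaches.
    """
    # Build the de-duplicated list of recently learned skills, in order seen.
    recent = []
    for skills in similar_recent_skills:
        for skill in skills:
            if skill not in recent:
                recent.append(skill)

    recommended = []
    for skill in recent:
        if skill in member_skills:
            continue  # the member already possesses this skill
        for course, taught in courses.items():
            if skill in taught and course not in recommended:
                recommended.append(course)
    return recommended


result = recommend_courses(
    {"python"},
    [["python", "sql"], ["spark", "sql"]],
    {"Intro to SQL": {"sql"}, "Spark Basics": {"spark"}, "Python 101": {"python"}},
)
```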