Semi-supervised learning via deep label propagation

    Publication No.: US10922609B2

    Publication date: 2021-02-16

    Application No.: US15597290

    Filing date: 2017-05-17

    Applicant: Facebook, Inc.

    Abstract: In one embodiment, a system may access a graph data structure that includes nodes and connections between the nodes. Each node may be associated with a user; each connection between two nodes may represent a relationship between the associated users; and each node may be either labeled or unlabeled with respect to a label type. For each labeled node, a label of the label type of that labeled node may be propagated to other nodes through the connections. For each node, the system may store label distribution information associated with the label type based on the propagated labels reaching the node. The system may train a machine-learning model using the labels and the label distribution information of a set of the labeled nodes. A predicted label for each unlabeled node may be generated using the model and the label distribution information of the unlabeled node.
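    The pipeline described in this abstract (propagate labels, store a per-node label distribution, train on labeled nodes, predict for unlabeled ones) can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy graph, the hop limit `max_hops`, the 1/(hops + 1) weighting, and the nearest-centroid classifier that stands in for the unspecified machine-learning model.

```python
from collections import Counter, deque

# Hypothetical toy graph: two friend clusters joined by the edge d-e.
ADJ = {
    "a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b", "d"],
    "d": ["b", "c", "e"], "e": ["d", "f", "g"], "f": ["e", "g", "h"],
    "g": ["e", "f", "h"], "h": ["f", "g"],
}
LABELS = {"a": "A", "b": "A", "f": "B", "g": "B"}  # all other nodes are unlabeled
CLASSES = sorted(set(LABELS.values()))


def propagate_labels(adj, labels, max_hops=2):
    """Push each labeled node's label outward and tally what reaches every node."""
    received = {node: Counter() for node in adj}
    for source, label in labels.items():
        seen = {source}  # a node never receives its own label
        frontier = deque([(source, 0)])
        while frontier:
            node, hops = frontier.popleft()
            if hops == max_hops:
                continue
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    # Assumed weighting: labels from nearer nodes count more.
                    received[nbr][label] += 1.0 / (hops + 1)
                    frontier.append((nbr, hops + 1))
    return received


def to_distribution(counter, classes):
    """Normalize received label weights into a fixed-order probability vector."""
    total = sum(counter.values())
    return [counter[c] / total if total else 0.0 for c in classes]


def train_centroids(features, labels, classes):
    """Stand-in model: the mean label distribution of each class's labeled nodes."""
    centroids = {}
    for c in classes:
        rows = [features[n] for n, lab in labels.items() if lab == c]
        centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, vec):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda c: sq_dist(centroids[c], vec))


received = propagate_labels(ADJ, LABELS)
features = {n: to_distribution(received[n], CLASSES) for n in ADJ}
model = train_centroids(features, LABELS, CLASSES)
predictions = {n: predict(model, features[n]) for n in ADJ if n not in LABELS}
print(predictions)
```

    In this sketch the model sees only label distributions, never the raw graph, so any classifier that accepts a fixed-length feature vector could replace the centroid stand-in.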

    Semi-Supervised Learning via Deep Label Propagation

    Publication No.: US20180336457A1

    Publication date: 2018-11-22

    Application No.: US15597290

    Filing date: 2017-05-17

    Applicant: Facebook, Inc.

    Abstract: In one embodiment, a system may access a graph data structure that includes nodes and connections between the nodes. Each node may be associated with a user; each connection between two nodes may represent a relationship between the associated users; and each node may be either labeled or unlabeled with respect to a label type. For each labeled node, a label of the label type of that labeled node may be propagated to other nodes through the connections. For each node, the system may store label distribution information associated with the label type based on the propagated labels reaching the node. The system may train a machine-learning model using the labels and the label distribution information of a set of the labeled nodes. A predicted label for each unlabeled node may be generated using the model and the label distribution information of the unlabeled node.

    Identifying multiple languages in a content item

    Publication No.: US10180935B2

    Publication date: 2019-01-15

    Application No.: US15422463

    Filing date: 2017-02-02

    Applicant: Facebook, Inc.

    Abstract: A system for identifying language(s) for content items is disclosed. The system can identify different languages for content item word segments by identifying segment languages that maximize a probability across the segments. The probability can be a combination of: an author's likelihood for the language identified for the first word; a combination of transition frequencies for selected languages identified for words, the transition frequencies indicating likelihoods that a transition occurred to the selected language from the previous word's language; and a combination of observation probabilities indicating, for a given word in the content item, a likelihood the given word is in the identified language. For an in-vocabulary word, the observation probabilities can be based on a learned probability for that word. For an out-of-vocabulary word, the probability can be computed by breaking the word into overlapping n-grams and computing combined learned probabilities that each n-gram is in the given language.
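    The combination of an author prior, transition likelihoods, and per-word observation probabilities has the shape of a hidden Markov model, so one way to find the maximizing language sequence is Viterbi decoding. The sketch below assumes that reading; the abstract does not name a decoding algorithm. All probability tables are hypothetical toy values, the trigram size and the `^`/`$` padding are assumptions, and averaging n-gram log-probabilities is only one possible way to "combine" them for out-of-vocabulary words.

```python
import math

LANGS = ["en", "es"]
# Hypothetical learned tables; real values would come from training data.
AUTHOR_PRIOR = {"en": 0.7, "es": 0.3}
TRANSITION = {"en": {"en": 0.8, "es": 0.2}, "es": {"en": 0.2, "es": 0.8}}
WORD_PROB = {
    "hello": {"en": 0.9, "es": 0.1},
    "my": {"en": 0.9, "es": 0.1},
    "amigo": {"en": 0.05, "es": 0.95},
}
NGRAM_PROB = {
    "que": {"en": 0.1, "es": 0.9},
    "ido": {"en": 0.2, "es": 0.8},
    "do$": {"en": 0.2, "es": 0.8},
}
UNSEEN = 0.5  # assumed uninformative value for n-grams missing from the table


def ngrams(word, n=3):
    """Overlapping character n-grams of the word padded with boundary markers."""
    padded = f"^{word}$"
    return [padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))]


def observation_logprob(word, lang):
    """Log-likelihood that `word` is in `lang`."""
    if word in WORD_PROB:  # in-vocabulary: use the learned word probability
        return math.log(WORD_PROB[word][lang])
    # Out-of-vocabulary: average the n-gram log-probabilities (assumed combination).
    grams = ngrams(word)
    logs = [math.log(NGRAM_PROB.get(g, {}).get(lang, UNSEEN)) for g in grams]
    return sum(logs) / len(logs)


def identify_languages(words):
    """Viterbi decoding: the per-word language sequence with the highest score."""
    scores = {
        lang: math.log(AUTHOR_PRIOR[lang]) + observation_logprob(words[0], lang)
        for lang in LANGS
    }
    backpointers = []
    for word in words[1:]:
        new_scores, pointers = {}, {}
        for lang in LANGS:
            # Best previous language to transition from into `lang`.
            prev = max(LANGS, key=lambda p: scores[p] + math.log(TRANSITION[p][lang]))
            pointers[lang] = prev
            new_scores[lang] = (scores[prev] + math.log(TRANSITION[prev][lang])
                                + observation_logprob(word, lang))
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path backwards from the best final language.
    path = [max(LANGS, key=scores.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


result = identify_languages("hello my amigo querido".split())
print(result)
```

    Working in log space avoids numeric underflow on long content items; "querido" is absent from the toy `WORD_PROB` table, so it exercises the n-gram fallback.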

    IDENTIFYING MULTIPLE LANGUAGES IN A CONTENT ITEM

    Publication No.: US20180189259A1

    Publication date: 2018-07-05

    Application No.: US15422463

    Filing date: 2017-02-02

    Applicant: Facebook, Inc.

    CPC classification number: G06F17/275 G06F17/2294 G06F17/2775

    Abstract: A system for identifying language(s) for content items is disclosed. The system can identify different languages for content item word segments by identifying segment languages that maximize a probability across the segments. The probability can be a combination of: an author's likelihood for the language identified for the first word; a combination of transition frequencies for selected languages identified for words, the transition frequencies indicating likelihoods that a transition occurred to the selected language from the previous word's language; and a combination of observation probabilities indicating, for a given word in the content item, a likelihood the given word is in the identified language. For an in-vocabulary word, the observation probabilities can be based on a learned probability for that word. For an out-of-vocabulary word, the probability can be computed by breaking the word into overlapping n-grams and computing combined learned probabilities that each n-gram is in the given language.
