Training data quality for spam classification

    Publication No.: US11232369B1

    Publication date: 2022-01-25

    Application No.: US15698797

    Filing date: 2017-09-08

    Applicant: Facebook, Inc.

    Abstract: In one embodiment, a method includes accessing posts in a social-networking system. Each of the posts is unlabeled with respect to whether the post is known to be spam. The method also includes determining a posting user who submitted the post to the social-networking system and a recipient user to whom the post is addressed. The method further includes determining a first vector representation of the posting user and a second vector representation of the recipient user based on one or more features associated with the post, the posting user, and the recipient user. The method still further includes comparing the vector representations and building a machine learning model for automatically detecting spam posts in the social-networking system using a subset of the plurality of posts as non-spam training data.

    TRAINING DATA QUALITY FOR SPAM CLASSIFICATION

    Publication No.: US20220101203A1

    Publication date: 2022-03-31

    Application No.: US17550280

    Filing date: 2021-12-14

    Applicant: Facebook, Inc.

    Abstract: In one embodiment, a method includes accessing posts in a social-networking system. Each of the posts is unlabeled with respect to whether the post is known to be spam. The method also includes determining a posting user who submitted the post to the social-networking system and a recipient user to whom the post is addressed. The method further includes determining a first vector representation of the posting user and a second vector representation of the recipient user based on one or more features associated with the post, the posting user, and the recipient user. The method still further includes comparing the vector representations and building a machine learning model for automatically detecting spam posts in the social-networking system using a subset of the plurality of posts as non-spam training data.
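
The selection step described in the abstracts can be illustrated with a minimal sketch. The abstracts do not say how the two vector representations are compared or how the comparison picks the non-spam subset, so everything specific here is an assumption made for illustration: cosine similarity as the comparison, a fixed threshold of 0.8, the rule that a post between similar users is treated as likely non-spam, and the hypothetical names `cosine_similarity`, `select_non_spam_candidates`, and `user_vectors`.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def select_non_spam_candidates(posts, user_vectors, threshold=0.8):
    """Keep unlabeled posts whose poster and recipient vectors are similar.

    Assumption (not stated in the abstracts): a high similarity between the
    posting user and the recipient user suggests the post is not spam.
    """
    selected = []
    for post in posts:
        poster_vec = user_vectors[post["poster"]]
        recipient_vec = user_vectors[post["recipient"]]
        if cosine_similarity(poster_vec, recipient_vec) >= threshold:
            selected.append(post)
    return selected

# Hypothetical user vectors; a real system would derive them from features
# of the post, the posting user, and the recipient user.
user_vectors = {
    "alice": [1.0, 0.0, 1.0],
    "bob": [0.9, 0.1, 1.0],
    "mallory": [0.0, 1.0, 0.0],
}

# Unlabeled posts: none is known to be spam or non-spam.
posts = [
    {"id": 1, "poster": "alice", "recipient": "bob"},
    {"id": 2, "poster": "mallory", "recipient": "bob"},
]

non_spam_training_data = select_non_spam_candidates(posts, user_vectors)
```

In this sketch only the post from "alice" to "bob" passes the threshold, so it alone would be handed to the classifier as non-spam training data. The threshold trades training-set size against label noise and would need tuning on real data.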
