Website similarity determination
    1.
    Granted invention patent

    Publication No.: US11675873B1

    Publication date: 2023-06-13

    Application No.: US17809513

    Application date: 2022-06-28

    Applicant: Lemon Inc.

    CPC classification number: G06F16/958 G06F16/954

    Abstract: There are provided methods, devices, and computer program products for similarity determination. In a method, first and second access data are obtained for a first and a second group of users who access a first and a second website, respectively. A first and a second jump path are generated for the first and second groups of users based on the first and second access data, respectively. The first and second jump paths describe access history for the first and second groups of users among webpages in the first and second websites, respectively. A similarity is determined between the first and second websites based on the first and second jump paths. Here, access data are used for similarity determination and unvisited webpages are not considered in the similarity determination. Therefore, the computation workload may be lowered, and the noise caused by the unvisited webpages may be reduced.
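The idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how a jump path is represented or which similarity measure is used, so everything below is an assumption: sessions are lists of page-type names (hypothetical names such as "home" and "cart"), a jump path is a count of page-to-page transitions, and the similarity is a cosine similarity over those counts. Pages that nobody visited never appear in the counts, which mirrors the point about ignoring unvisited webpages.

```python
from collections import Counter
from math import sqrt


def build_jump_path(sessions):
    """Count page-to-page jumps over all sessions of one user group.

    Assumption: each session is an ordered list of page-type names.
    """
    jumps = Counter()
    for pages in sessions:
        for src, dst in zip(pages, pages[1:]):
            jumps[(src, dst)] += 1
    return jumps


def cosine_similarity(path_a, path_b):
    """Cosine similarity between two jump-count mappings (assumed measure)."""
    shared = path_a.keys() & path_b.keys()
    dot = sum(path_a[k] * path_b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in path_a.values()))
    norm_b = sqrt(sum(v * v for v in path_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical access data for the first and second websites.
site_a_sessions = [["home", "list", "item", "cart"], ["home", "item"]]
site_b_sessions = [["home", "list", "item"], ["home", "list", "item", "cart"]]

path_a = build_jump_path(site_a_sessions)
path_b = build_jump_path(site_b_sessions)
similarity = cosine_similarity(path_a, path_b)
print(f"similarity = {similarity:.3f}")
```

Because only observed jumps are stored, the cost of the comparison grows with the amount of access data rather than with the total number of pages on either site.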

    SAMPLE PROCESSING BASED ON LABEL MAPPING
    2.
    Published invention application

    Publication No.: US20230409678A1

    Publication date: 2023-12-21

    Application No.: US17845956

    Application date: 2022-06-21

    Applicant: LEMON INC.

    CPC classification number: G06K9/6227 G06K9/6256 G06K9/6267

    Abstract: A method is proposed for sample processing. A first label for a training sample in a plurality of training samples is mapped into a second label, the first label being represented in a first label space and the second label being represented in a second label space smaller than the first label space. A plurality of classification models are obtained based on the second label and the training sample, a classification model describing an association relationship between a sample and a classification of a label, represented in the second label space, for the sample. A prediction model is generated based on the plurality of classification models, the prediction model describing an association relationship between a sample and a label, represented in the first label space, for the sample. The long tail effect in the original label space may be alleviated in building the prediction model.
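A minimal sketch of the label-mapping idea follows. The abstract does not specify the mapping, the classifier type, or how the classifiers are combined, so all of these are assumptions made for illustration: a first label in a space of size 9 is split into two base-3 digits (two second labels in a space of size 3), a toy nearest-centroid classifier is trained per digit, and the combined predictor reassembles the digits into a first-space label. All function names and the toy data are hypothetical.

```python
from collections import defaultdict

BASE = 3  # assumed size of the smaller (second) label space


def map_label(label, position):
    """Map a first-space label to one base-3 digit in the second space."""
    return (label // BASE ** position) % BASE


def train_centroids(samples, labels):
    """Fit a toy nearest-centroid classifier: one mean vector per label."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    return {
        y: tuple(sum(col) / len(xs) for col in zip(*xs))
        for y, xs in grouped.items()
    }


def classify(centroids, x):
    """Return the second-space label whose centroid is closest to x."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(center, x))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))


def predict(models, x):
    """Combine the per-digit classifiers into a first-space label."""
    return sum(classify(m, x) * BASE ** pos for pos, m in enumerate(models))


# Toy data: one 2-D sample per first-space label 0..8.
first_labels = list(range(BASE ** 2))
samples = [(float(map_label(y, 0)), float(map_label(y, 1))) for y in first_labels]

# One classification model per mapping, each trained on second-space labels.
models = [
    train_centroids(samples, [map_label(y, pos) for y in first_labels])
    for pos in range(2)
]

predicted = predict(models, (0.9, 2.1))
print(predicted)
```

In this toy setup each classifier sees three classes with three samples each instead of nine classes with one sample each, which is one way to picture how a smaller label space can ease a long-tail distribution.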
