-
Publication No.: US11675873B1
Publication Date: 2023-06-13
Application No.: US17809513
Application Date: 2022-06-28
Applicant: Lemon Inc.
Inventor: Han Wang, Hongyu Xiong, Zheng Chen, Tianyu Zhang, Yiqi Feng, Yuan Gao, Xiangyu Zeng, Rui Li, Qingyi Lu, Yihan Yang, Yu Zhang, Bin Liu
IPC: G06F16/958, G06F16/954
CPC classification number: G06F16/958, G06F16/954
Abstract: There are provided methods, devices, and computer program products for similarity determination. In a method, first and second access data are obtained for a first and a second group of users who access a first and a second website, respectively. A first and a second jump path are generated for the first and second groups of users based on the first and second access data, respectively. The first and second jump paths describe access history for the first and second groups of users among webpages in the first and second websites, respectively. A similarity is determined between the first and second websites based on the first and second jump paths. Here, access data are used for similarity determination and unvisited webpages are not considered in the similarity determination. Therefore, the computation workload may be lowered, and the noise caused by the unvisited webpages may be reduced.
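Illustrative sketch (not taken from the patent): the abstract does not say how a jump path is encoded or which similarity measure is used. The example below assumes that pages on both websites have already been normalized to shared page types (for example "home", "product", "cart"), represents each jump path as counts of page-to-page transitions, and compares the two websites with cosine similarity. The function names `build_jump_path` and `cosine_similarity` are hypothetical. Only transitions that users actually made are counted, so unvisited webpages never enter the computation.

```python
from collections import Counter
from math import sqrt


def build_jump_path(sessions):
    """Aggregate per-user page sequences into page-to-page transition counts.

    `sessions` is a list of page-type sequences, one per user visit.
    This encoding of a "jump path" is an assumption made for illustration.
    """
    transitions = Counter()
    for pages in sessions:
        for src, dst in zip(pages, pages[1:]):
            transitions[(src, dst)] += 1
    return transitions


def cosine_similarity(path_a, path_b):
    """Cosine similarity between two transition-count vectors (assumed measure)."""
    # A Counter returns 0 for a missing key, so unshared transitions add nothing.
    dot = sum(count * path_b[key] for key, count in path_a.items())
    norm_a = sqrt(sum(v * v for v in path_a.values()))
    norm_b = sqrt(sum(v * v for v in path_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up access data for two websites.
site_a = [["home", "product", "cart"], ["home", "product"]]
site_b = [["home", "product", "cart"]]

similarity = cosine_similarity(build_jump_path(site_a), build_jump_path(site_b))
print(round(similarity, 4))
```

Counting transitions rather than whole paths keeps the representation small and tolerant of sessions of different lengths; other encodings, such as graph embeddings of the jump path, would fit the abstract equally well.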
-
Publication No.: US20230409678A1
Publication Date: 2023-12-21
Application No.: US17845956
Application Date: 2022-06-21
Applicant: LEMON INC.
Inventor: Hongyu XIONG, Yihan Yang
IPC: G06K9/62
CPC classification number: G06K9/6227, G06K9/6256, G06K9/6267
Abstract: A method is proposed for sample processing. A first label for a training sample in a plurality of training samples is mapped into a second label, the first label being represented in a first label space and the second label being represented in a second label space smaller than the first label space. A plurality of classification models are obtained based on the second label and the training sample, a classification model describing an association relationship between a sample and a classification of a label, represented in the second label space, for the sample. A prediction model is generated based on the plurality of classification models, the prediction model describing an association relationship between a sample and a label, represented in the first label space, for the sample. The long tail effect in the original label space may be alleviated in building the prediction model.
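Illustrative sketch (not taken from the patent): the abstract does not specify how labels are mapped, which classifiers are used, or how their outputs are combined. The example below assumes that each classification model uses a different mapping from original labels to a small set of buckets (one base-`num_buckets` digit of the label's index), that each model is a simple nearest-centroid classifier, and that the prediction model scores an original label by summing the scores of the buckets it maps to. All function names are hypothetical.

```python
from collections import defaultdict


def make_mapping(labels, num_buckets, position):
    """Map each original label to a bucket (assumed scheme: one digit of its index)."""
    return {
        label: (i // (num_buckets ** position)) % num_buckets
        for i, label in enumerate(sorted(labels))
    }


def fit_centroids(samples, bucket_labels):
    """Fit a nearest-centroid classifier over the smaller (bucket) label space."""
    groups = defaultdict(list)
    for x, bucket in zip(samples, bucket_labels):
        groups[bucket].append(x)
    return {
        bucket: tuple(sum(col) / len(xs) for col in zip(*xs))
        for bucket, xs in groups.items()
    }


def train(samples, labels, num_buckets, num_models):
    """Train one bucket-level classifier per mapping."""
    models = []
    for position in range(num_models):
        mapping = make_mapping(set(labels), num_buckets, position)
        centroids = fit_centroids(samples, [mapping[y] for y in labels])
        models.append((mapping, centroids))
    return models


def predict(models, x):
    """Score each original label by summing the scores of its buckets."""
    totals = defaultdict(float)
    for mapping, centroids in models:
        # Score of a bucket: negative squared distance to its centroid.
        scores = {
            bucket: -sum((xi - ci) ** 2 for xi, ci in zip(x, c))
            for bucket, c in centroids.items()
        }
        for label, bucket in mapping.items():
            totals[label] += scores[bucket]
    return max(totals, key=totals.get)


# Made-up data: four original labels, two buckets per model, two models.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
labels = ["l0", "l1", "l2", "l3"]
models = train(samples, labels, num_buckets=2, num_models=2)
print(predict(models, (0.9, 0.1)))
```

Because each bucket pools the samples of several original labels, a rarely seen label shares training data with more frequent ones, which is one way the long-tail effect could be reduced. With `num_models` mappings of `num_buckets` buckets each, this digit scheme can distinguish up to `num_buckets ** num_models` original labels.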
-