-
Publication No.: US11580170B2
Publication date: 2023-02-14
Application No.: US16670809
Filing date: 2019-10-31
Applicant: Google LLC
Inventor: Xuerui Wang, Daniel Li, Xiaodan Song, Jie Han, Rahul Sharma
IPC: G06F16/906, G06F16/215, G06K9/62
Abstract: Generating granular clusters for real-time processing is provided. The systems can identify tokens based on aggregating input from computing devices over a time interval. The systems can identify, based on metrics, a subset of tokens for cluster generation. The systems can generate, via a clustering technique, token clusters from the subset of the tokens, each of the token clusters comprising two or more tokens from the subset of the tokens. The systems can apply a de-duplication technique to each of the token clusters. The systems can apply a filtering technique to the token clusters to remove tokens erroneously grouped in a token cluster. The systems can assign, based on a selection process, a label for each of the token clusters. The systems can activate, based on a number of remaining tokens in each of the token clusters, a subset of the token clusters for real-time content selection.
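The steps named in this abstract (aggregate, select by metric, cluster, de-duplicate, filter, label, activate) can be sketched as a small pipeline. This is only an illustrative sketch, not the patented method: the abstract does not name the metric, the clustering technique, or the thresholds, so the frequency metric, the word-overlap (Jaccard) similarity, the greedy seed-based clustering, and all names (`build_clusters`, `min_count`, `sim_threshold`, `min_size`) are assumptions.

```python
from collections import Counter


def words(token):
    """Lower-cased set of words in a token (hypothetical normalization)."""
    return frozenset(token.lower().split())


def jaccard(a, b):
    """Word-overlap similarity in [0, 1]; an assumed stand-in metric."""
    wa, wb = words(a), words(b)
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def mean_sim(token, group):
    """Average similarity of a token to the other tokens in its group."""
    others = [o for o in group if o != token]
    return sum(jaccard(token, o) for o in others) / len(others)


def build_clusters(raw_tokens, min_count=2, sim_threshold=0.5, min_size=2):
    # 1. Aggregate input assumed to be collected over one time interval.
    counts = Counter(raw_tokens)
    # 2. Metric-based subset: keep tokens seen at least min_count times.
    subset = sorted((t for t in counts if counts[t] >= min_count),
                    key=lambda t: (-counts[t], t))
    # 3. Greedy clustering: a token joins the first cluster with a similar seed.
    clusters = []
    for tok in subset:
        for cluster in clusters:
            if jaccard(tok, cluster[0]) >= sim_threshold:
                cluster.append(tok)
                break
        else:
            clusters.append([tok])
    active = {}
    for cluster in clusters:
        if len(cluster) < 2:  # a token cluster comprises two or more tokens
            continue
        # 4. De-duplicate: keep one token per normalized word set.
        unique = {}
        for tok in cluster:
            unique.setdefault(words(tok), tok)
        deduped = list(unique.values())
        # 5. Filter: drop tokens too dissimilar from the rest of the cluster.
        kept = [t for t in deduped
                if len(deduped) == 1 or mean_sim(t, deduped) >= sim_threshold]
        # 6. Label with the most frequent remaining token.
        # 7. Activate only clusters with enough remaining tokens.
        if len(kept) >= min_size:
            label = max(kept, key=lambda t: counts[t])
            active[label] = kept
    return active
```

Raising `min_size` makes activation stricter: a cluster that shrinks below it after de-duplication and filtering is not returned.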
-
Publication No.: US12086211B2
Publication date: 2024-09-10
Application No.: US18167297
Filing date: 2023-02-10
Applicant: Google LLC
Inventor: Xuerui Wang, Daniel Li, Xiaodan Song, Jie Han, Rahul Sharma
IPC: G06F18/23211, G06F16/215, G06F16/906, G06F18/23213
CPC classification number: G06F18/23211, G06F16/215, G06F16/906, G06F18/23213
Abstract: Generating granular clusters for real-time processing is provided. The systems can identify tokens based on aggregating input from computing devices over a time interval. The systems can identify, based on metrics, a subset of tokens for cluster generation. The systems can generate, via a clustering technique, token clusters from the subset of the tokens, each of the token clusters comprising two or more tokens from the subset of the tokens. The systems can apply a de-duplication technique to each of the token clusters. The systems can apply a filtering technique to the token clusters to remove tokens erroneously grouped in a token cluster. The systems can assign, based on a selection process, a label for each of the token clusters. The systems can activate, based on a number of remaining tokens in each of the token clusters, a subset of the token clusters for real-time content selection.
-
Publication No.: US20210173869A1
Publication date: 2021-06-10
Application No.: US16623096
Filing date: 2018-08-30
Applicant: Google LLC
Inventor: Feng Li, Xuerui Wang
IPC: G06F16/906, G06F16/953
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for clustering data elements. In one aspect, a method includes determining a respective linkage value for each of multiple cluster pairs, where each cluster pair includes a respective first cluster and a respective second cluster. Determining a linkage value for a cluster pair includes determining a set of pairwise similarity values for the cluster pair. Each pairwise similarity value defines a similarity measure between: (i) a particular data element from the first cluster of the cluster pair, and (ii) a given data element from the second cluster of the cluster pair. The linkage value for the cluster pair is assigned as a given percentile of the set of pairwise similarity values, wherein the given percentile is greater than 0 and less than 100. A cluster pair is merged based on the linkage values of the cluster pairs.
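The percentile-based linkage described in this abstract can be illustrated with a short agglomerative loop. This is a minimal sketch under stated assumptions, not the claimed implementation: the abstract only says that a pair is merged "based on the linkage values", so merging the highest-linkage pair first and stopping below a `min_linkage` threshold are assumptions, as are the linear-interpolation percentile and the names `percentile_of`, `percentile_linkage`, and `agglomerate`.

```python
import itertools


def percentile_of(values, q):
    """q-th percentile of values using linear interpolation (assumed method)."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def percentile_linkage(cluster_a, cluster_b, similarity, percentile=50.0):
    """Linkage value: a percentile of all cross-cluster pairwise similarities."""
    if not 0.0 < percentile < 100.0:
        raise ValueError("percentile must be greater than 0 and less than 100")
    sims = [similarity(x, y) for x in cluster_a for y in cluster_b]
    return percentile_of(sims, percentile)


def agglomerate(elements, similarity, percentile=50.0, min_linkage=0.5):
    """Merge the highest-linkage cluster pair until none reaches min_linkage."""
    clusters = [[e] for e in elements]
    while len(clusters) > 1:
        best_score, i, j = max(
            ((percentile_linkage(clusters[i], clusters[j], similarity, percentile), i, j)
             for i, j in itertools.combinations(range(len(clusters)), 2)),
            key=lambda t: t[0],
        )
        if best_score < min_linkage:  # assumed stopping rule
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i stays valid
    return clusters
```

A percentile strictly between 0 and 100 sits between the extremes of the pairwise similarities (the least similar pair at 0 and the most similar pair at 100), which makes the linkage less sensitive to a single outlying pair than either extreme.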
-
Publication No.: US20230267176A1
Publication date: 2023-08-24
Application No.: US18167297
Filing date: 2023-02-10
Applicant: Google LLC
Inventor: Xuerui Wang, Daniel Li, Xiaodan Song, Jie Han, Rahul Sharma
IPC: G06F18/23211, G06F16/906, G06F16/215, G06F18/23213
CPC classification number: G06F18/23211, G06F16/906, G06F16/215, G06F18/23213
Abstract: Generating granular clusters for real-time processing is provided. The systems can identify tokens based on aggregating input from computing devices over a time interval. The systems can identify, based on metrics, a subset of tokens for cluster generation. The systems can generate, via a clustering technique, token clusters from the subset of the tokens, each of the token clusters comprising two or more tokens from the subset of the tokens. The systems can apply a de-duplication technique to each of the token clusters. The systems can apply a filtering technique to the token clusters to remove tokens erroneously grouped in a token cluster. The systems can assign, based on a selection process, a label for each of the token clusters. The systems can activate, based on a number of remaining tokens in each of the token clusters, a subset of the token clusters for real-time content selection.
-
Publication No.: US11347812B2
Publication date: 2022-05-31
Application No.: US16623096
Filing date: 2018-08-30
Applicant: Google LLC
Inventor: Feng Li, Xuerui Wang
IPC: G06F7/00, G06F16/906, G06F16/953, G06K9/62
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for clustering data elements. In one aspect, a method includes determining a respective linkage value for each of multiple cluster pairs, where each cluster pair includes a respective first cluster and a respective second cluster. Determining a linkage value for a cluster pair includes determining a set of pairwise similarity values for the cluster pair. Each pairwise similarity value defines a similarity measure between: (i) a particular data element from the first cluster of the cluster pair, and (ii) a given data element from the second cluster of the cluster pair. The linkage value for the cluster pair is assigned as a given percentile of the set of pairwise similarity values, wherein the given percentile is greater than 0 and less than 100. A cluster pair is merged based on the linkage values of the cluster pairs.