Privacy-Preserving Learning and Analytics of a Shared Embedding Space Across Multiple Separate Data Silos

    Publication number: US20240346367A1

    Publication date: 2024-10-17

    Application number: US18300926

    Filing date: 2023-04-14

    Applicant: Google LLC

    CPC classification number: G06N20/00

    Abstract: Provided are systems and methods for privacy-preserving learning and analytics of a shared embedding space for data split across multiple separate data silos. A central computing system can generate a plurality of synthetic data examples having respective feature data within an aggregate feature-space that represents an aggregation of different component feature-spaces associated with the multiple separate data silos. The synthetic data examples can be used by different computing systems associated with the data silos to generate embeddings within a shared embedding space. Once the embeddings have been generated in the shared embedding space, multiple different types of analytics can be performed on the shared embedding space. As one example, the multiple data silos can correspond to multiple separate entity domains and an analysis of embeddings generated in the shared embedding space can be used to facilitate identification or classification of malicious actors across the multiple separate entity domains.

    DATA SAMPLING USING LOCALITY SENSITIVE HASHING FOR LARGE SCALE GRAPH LEARNING

    Publication number: US20250045636A1

    Publication date: 2025-02-06

    Application number: US18794578

    Filing date: 2024-08-05

    Applicant: Google LLC

    Abstract: A method and systems are disclosed for data sampling using locality sensitive hashing. A training data set comprising a plurality of data points is received. Each data point of the plurality of data points is assigned to a hash bucket of a set of hash buckets associated with a set of hash functions. A sample set of data points is generated by sampling data points from each bucket of the set of hash buckets. A plurality of sample data point pairs is formed, each comprising a pair of data points from the sample set of data points. An artificial intelligence (AI) model is trained, using the plurality of sample data point pairs, to output a numerical value that indicates a degree of similarity between an input pair of data points. A data structure representing relationships between data points of the plurality of data points is generated using the trained AI model and the training data set.
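
The bucket-then-sample steps in this abstract can be illustrated with a minimal sketch. The abstract does not name a hash family, so random-hyperplane (sign) hashing is assumed here, and the helper names (`make_hyperplanes`, `hash_bucket`, `assign_buckets`, `sample_from_buckets`, `make_pairs`) and the `per_bucket` parameter are hypothetical. Training the similarity model and building the relationship data structure are not shown.

```python
import random
from collections import defaultdict
from itertools import combinations


def make_hyperplanes(num_planes, dim, rng):
    """Draw random hyperplane normals (assumed hash family, not from the patent)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def hash_bucket(point, hyperplanes):
    """Bucket key: one sign bit per hyperplane."""
    return tuple(
        1 if sum(p * h for p, h in zip(point, plane)) >= 0 else 0
        for plane in hyperplanes
    )


def assign_buckets(points, hyperplanes):
    """Assign every data point (by index) to exactly one hash bucket."""
    buckets = defaultdict(list)
    for idx, point in enumerate(points):
        buckets[hash_bucket(point, hyperplanes)].append(idx)
    return buckets


def sample_from_buckets(buckets, per_bucket, rng):
    """Sample up to `per_bucket` point indices from each bucket."""
    sample = []
    for key in sorted(buckets):
        members = buckets[key]
        sample.extend(rng.sample(members, min(per_bucket, len(members))))
    return sample


def make_pairs(sample):
    """Form every unordered pair of sampled point indices."""
    return list(combinations(sample, 2))


rng = random.Random(0)
points = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
hyperplanes = make_hyperplanes(num_planes=3, dim=4, rng=rng)
buckets = assign_buckets(points, hyperplanes)
sample = sample_from_buckets(buckets, per_bucket=2, rng=rng)
pairs = make_pairs(sample)
```

With three hyperplanes there are at most eight buckets, so the sample holds at most sixteen points no matter how large the training set is. That bound is what keeps pair generation tractable in this sketch.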

    Federated Privacy-Preserving Nearest-Neighbor Search (NNS)-Based Label Propagation on Shared Embedding Space

    Publication number: US20250005149A1

    Publication date: 2025-01-02

    Application number: US18343132

    Filing date: 2023-06-28

    Applicant: Google LLC

    Abstract: For a plurality of iterations, entity detection information is obtained from one or more client computing devices. The entity detection information includes (a) information that indicates whether an entity detected at the client computing device is malicious, and (b) information that associates the entity with a particular subspace of a plurality of subspaces of an embedding space. The entity detection information received over the plurality of iterations is aggregated to obtain aggregated threat information, wherein the aggregated threat information is descriptive of a number of malicious entities and a total number of entities detected for each subspace of the plurality of subspaces. Based on the entity detection information, subspace classification information is generated that identifies a first subspace of the plurality of subspaces as being a malicious subspace associated with malicious entities.
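
The per-subspace aggregation described here can be sketched as a simple counter. The report format (a `(subspace_id, is_malicious)` tuple) and the classification rule (a minimum malicious ratio plus a minimum report count) are assumptions made for illustration; the abstract does not state how a subspace is classified, and any privacy mechanism applied to the reports is omitted.

```python
from collections import defaultdict


def aggregate_reports(iterations):
    """Count malicious and total detections per subspace across all iterations.

    `iterations` is an iterable of report lists; each report is a
    (subspace_id, is_malicious) tuple (hypothetical format).
    """
    counts = defaultdict(lambda: {"malicious": 0, "total": 0})
    for reports in iterations:
        for subspace_id, is_malicious in reports:
            counts[subspace_id]["total"] += 1
            if is_malicious:
                counts[subspace_id]["malicious"] += 1
    return dict(counts)


def classify_subspaces(counts, min_ratio=0.5, min_total=3):
    """Flag subspaces whose malicious share meets the assumed thresholds."""
    return {
        subspace_id
        for subspace_id, c in counts.items()
        if c["total"] >= min_total and c["malicious"] / c["total"] >= min_ratio
    }


iterations = [
    [(0, True), (1, False), (0, True)],
    [(0, False), (1, False), (2, True)],
    [(0, True), (1, True), (1, False)],
]
counts = aggregate_reports(iterations)
malicious_subspaces = classify_subspaces(counts)
```

In this toy input, subspace 0 collects three malicious reports out of four and is flagged. Subspace 2 has a single malicious report, which falls below the assumed `min_total`, so it is not flagged.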

    Detecting zero-day attacks with unknown signatures via mining correlation in behavioral change of entities over time

    Publication number: US11159564B2

    Publication date: 2021-10-26

    Application number: US16464779

    Filing date: 2018-06-28

    Applicant: Google LLC

    Inventor: Animesh Nandi

    Abstract: Zero-day attacks with unknown attack signatures are detected by correlating behavior differences of a plurality of entities. An entity baseline behavior, which includes multiple variables, is determined for each entity of the plurality of entities (310). An entity behavior difference for each entity is determined at a series of points in time (320). Correlations between the entity behavior differences for the plurality of entities are determined at the series of points in time (330). Based on these correlations, it is determined whether the plurality of entities is exhibiting coordinated behavior differences (340). An attack signature is determined based on the entity behavior differences and the correlations (350). A database of attack signatures is generated (360).
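
Steps 310 through 340 can be illustrated with a small sketch under stated assumptions: the baseline is taken as the per-variable mean of historical observations, the behavior difference as the Euclidean distance from that baseline, and the correlation as the Pearson coefficient. The abstract fixes none of these choices. The function names, the `threshold` value, and the host names are hypothetical. Signature extraction (350) and the signature database (360) are not shown.

```python
from itertools import combinations
from math import sqrt


def baseline(history):
    """Per-variable mean over historical observations (assumed baseline)."""
    n = len(history)
    return [sum(column) / n for column in zip(*history)]


def behavior_difference(observations, base):
    """Euclidean distance from the baseline at each point in time."""
    return [
        sqrt(sum((x - b) ** 2 for x, b in zip(obs, base)))
        for obs in observations
    ]


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def coordinated_pairs(diffs, threshold=0.9):
    """Entity pairs whose difference series correlate at or above `threshold`."""
    return [
        (a, b)
        for a, b in combinations(sorted(diffs), 2)
        if pearson(diffs[a], diffs[b]) >= threshold
    ]


base = baseline([[1.0, 10.0], [1.0, 10.0]])
observed = {
    "host_a": [[1.0, 10.0], [2.0, 10.0], [4.0, 10.0], [1.0, 10.0]],
    "host_b": [[1.0, 10.0], [1.0, 12.0], [1.0, 16.0], [1.0, 10.0]],
    "host_c": [[1.0, 10.0], [1.0, 10.0], [1.0, 10.0], [3.0, 10.0]],
}
diffs = {name: behavior_difference(obs, base) for name, obs in observed.items()}
pairs = coordinated_pairs(diffs)
```

Here `host_a` and `host_b` deviate from baseline at the same time steps, although in different variables, so their difference series correlate strongly. `host_c` deviates at a different time and is not paired with either.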
