SYSTEM FOR GENERATING SAMPLES TO GENERATE MACHINE LEARNING MODELS TO FACILITATE DETECTION OF SUSPICIOUS DIGITAL IDENTIFIERS

    Publication No.: US20240340314A1

    Publication Date: 2024-10-10

    Application No.: US18486995

    Filing Date: 2023-10-13

    Applicant: Lookout, Inc.

    IPC Classification: H04L9/40 G06N20/00

    Abstract: A system for generating samples to generate machine learning models to detect suspicious digital identifiers is disclosed. The system creates a novel balanced-categorized sample generation mechanism for building machine learning models so that the samples are balanced and not biased to any particular class label, such as suspicious or non-suspicious. The system initiates training of a machine learning model and obtains a labeled dataset containing samples verified as suspicious or non-suspicious. The system computes, based on a configuration to generate a balanced labeled dataset, a sampling weight for the samples. Using the computed sampling weight, the system performs sampling on the suspicious and non-suspicious samples over a time period. The system merges the sampled suspicious and non-suspicious samples to form a balanced labeled dataset and generates categorized labeled samples therefrom. The categorized labeled samples are utilized to train machine learning models to identify whether a digital identifier is suspicious.
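
The sampling steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function name `build_balanced_dataset`, the sample fields (`identifier`, `label`, `verified_on`), and the choice to downsample every class to the size of the smallest one are all assumptions made for illustration.

```python
import random
from collections import defaultdict
from datetime import date


def build_balanced_dataset(samples, start, end, seed=0):
    """Return (merged_samples, weights) with equal counts per class label.

    Hypothetical sketch: each sample is a dict with the assumed keys
    'identifier', 'label', and 'verified_on' (a datetime.date).
    """
    rng = random.Random(seed)

    # Keep only samples verified inside the time period, grouped by label.
    by_label = defaultdict(list)
    for sample in samples:
        if start <= sample["verified_on"] <= end:
            by_label[sample["label"]].append(sample)

    # Assumed balancing rule: downsample every class to the smallest class.
    target = min(len(group) for group in by_label.values())
    weights = {label: target / len(group) for label, group in by_label.items()}

    # Sample each class with its weight, then merge and shuffle.
    merged = []
    for label, group in by_label.items():
        merged.extend(rng.sample(group, round(weights[label] * len(group))))
    rng.shuffle(merged)
    return merged, weights
```

With 10 suspicious and 90 non-suspicious samples inside the window, this sketch assigns weights of 1.0 and 1/9 respectively and returns 20 merged samples. A production system would likely read the target ratio from configuration rather than hard-coding the smallest-class rule.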

    MACHINE LEARNING SYSTEM FOR AUTOMATED DETECTION OF SUSPICIOUS DIGITAL IDENTIFIERS

    Publication No.: US20240340312A1

    Publication Date: 2024-10-10

    Application No.: US18295766

    Filing Date: 2023-04-04

    Applicant: Lookout, Inc.

    IPC Classification: H04L9/40

    CPC Classification: H04L63/1483 H04L63/1408

    Abstract: A machine learning system for providing automated detection of suspicious digital identifiers is disclosed. The system receives a request to determine whether an identifier associated with a resource that a device is attempting to access is suspicious. In response to the request, the system selects a machine learning model and loads or computes features associated with the address to facilitate determination regarding suspiciousness of the digital identifier. The system executes the machine learning model utilizing the features to determine whether the digital identifier is suspicious. The determination regarding suspiciousness of the digital identifier is provided to a phishing and content protection classifier to persist the response in a database. The determination may be verified by an expert and may be utilized to prevent access to the resource associated with the identifier and to train the machine learning model to enhance future determinations relating to suspiciousness of digital identifiers.
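
The request flow described here (select a model, load or compute features, execute the model, persist the determination) can be sketched as follows. Everything in the sketch is hypothetical: the `Detector` class, the three toy features, the 0.5 threshold, and the in-memory list standing in for the database are illustrative assumptions, not details from the application.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List
from urllib.parse import urlparse

Features = Dict[str, float]


def compute_features(identifier: str) -> Features:
    """Toy lexical features (assumed for illustration only)."""
    host = urlparse(identifier).hostname or ""
    return {
        "length": float(len(identifier)),
        "digits": float(sum(c.isdigit() for c in identifier)),
        "subdomains": float(max(host.count(".") - 1, 0)),
    }


@dataclass
class Detector:
    models: Dict[str, Callable[[Features], float]]  # name -> scoring function
    threshold: float = 0.5                           # assumed decision cut-off
    feature_cache: Dict[str, Features] = field(default_factory=dict)
    verdict_log: List[dict] = field(default_factory=list)  # stand-in for a database

    def check(self, identifier: str, model_name: str) -> dict:
        model = self.models[model_name]               # select the model
        features = self.feature_cache.get(identifier)  # load features if cached
        if features is None:                           # otherwise compute them
            features = compute_features(identifier)
            self.feature_cache[identifier] = features
        score = model(features)                        # execute the model
        verdict = {
            "identifier": identifier,
            "model": model_name,
            "score": score,
            "suspicious": score >= self.threshold,
        }
        self.verdict_log.append(verdict)               # persist the determination
        return verdict


# A hand-written scoring rule standing in for a trained model.
toy_model = lambda f: min(1.0, 0.25 * f["subdomains"] + 0.05 * f["digits"])
detector = Detector(models={"toy": toy_model})
```

Calling `detector.check("http://login.example.com.evil.test/a1", "toy")` flags the identifier under this toy rule, while `https://example.com/` is not flagged. Expert verification and retraining, which the abstract also mentions, are outside the scope of the sketch.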

    SYSTEM FOR AUTOMATED MODEL SELECTION TO FACILITATE DETECTION OF SUSPICIOUS DIGITAL IDENTIFIERS

    Publication No.: US20240338576A1

    Publication Date: 2024-10-10

    Application No.: US18471099

    Filing Date: 2023-09-20

    Applicant: Lookout, Inc.

    Inventor: Aungon Nag Radon

    IPC Classification: G06N3/0985 H04L41/16

    CPC Classification: G06N3/0985 H04L41/16

    Abstract: A system for providing automated model generation to facilitate automated detection of suspicious digital identifiers is disclosed. The system trains, during a training process, a plurality of trainable machine learning models using a labeled dataset containing data verified as suspicious or non-suspicious to generate a plurality of trained machine learning models based on candidate machine learning algorithms. The system generates an optimal machine learning model from the plurality of trainable machine learning models. The optimal machine learning model can have an optimal combination of hyperparameters and an optimal model parameter combination learned via the training process using the optimal hyperparameter combination. The optimal machine learning model has a highest performance for suspiciousness determination according to a performance metric when compared to other trained machine learning models. The system can receive a request to determine whether an identifier is suspicious and utilize the optimal machine learning model to perform the determination.
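
The selection loop in this abstract can be approximated by an exhaustive search over candidate algorithms and hyperparameter combinations. The sketch below is an assumption-laden illustration: the two candidate algorithms (a single-feature threshold rule and k-nearest neighbours), the accuracy metric, and all function names are hypothetical and are not taken from the application.

```python
from itertools import product


def train_threshold(train, feature):
    """Hyperparameter: which feature to cut on. Learned parameter: the cut."""
    best_cut, best_acc = None, -1.0
    for cut in sorted({x[feature] for x, _ in train}):
        acc = sum((x[feature] >= cut) == y for x, y in train) / len(train)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return lambda x: x[feature] >= best_cut


def train_knn(train, k):
    """Hyperparameter: k. Predicts the majority label of the k nearest samples."""
    def predict(x):
        nearest = sorted(
            train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x))
        )[:k]
        return sum(y for _, y in nearest) * 2 > len(nearest)
    return predict


def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)


def select_best(candidates, train, validation, metric=accuracy):
    """Train every (algorithm, hyperparameters) pair and keep the top scorer."""
    best = None
    for name, (trainer, grid) in candidates.items():
        keys = list(grid)
        for values in product(*(grid[key] for key in keys)):
            params = dict(zip(keys, values))
            model = trainer(train, **params)
            score = metric(model, validation)
            if best is None or score > best["score"]:
                best = {"algorithm": name, "params": params,
                        "score": score, "model": model}
    return best


# Samples are (feature_tuple, is_suspicious) pairs; values are made up.
train = [((5, 0), False), ((1, 1), False), ((4, 2), False),
         ((2, 7), True), ((5, 8), True), ((1, 9), True)]
validation = [((3, 1), False), ((3, 8), True), ((0, 2), False), ((6, 9), True)]
candidates = {
    "threshold": (train_threshold, {"feature": [0, 1]}),
    "knn": (train_knn, {"k": [1, 3]}),
}
best = select_best(candidates, train, validation)
```

Ties are resolved in favour of the first candidate evaluated, which is a design choice of this sketch; a real pipeline would more likely use cross-validation and a metric suited to imbalanced classes.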