MALWARE DETECTION USING LOCAL COMPUTATIONAL MODELS

    Publication No.: US20190026466A1

    Publication date: 2019-01-24

    Application No.: US15657379

    Filing date: 2017-07-24

    Applicant: CrowdStrike, Inc.

    IPC classification: G06F21/56 G06F21/55

    Abstract: Example techniques herein determine that a trial data stream is associated with malware (“dirty”) using a local computational model (CM). The data stream can be represented by a feature vector. A control unit can receive a first, dirty feature vector (e.g., a false miss) and determine the local CM based on the first feature vector. The control unit can receive a trial feature vector representing the trial data stream. The control unit can determine that the trial data stream is dirty if a broad CM or the local CM determines that the trial feature vector is dirty. In some examples, the local CM can define a dirty region in a feature space. The control unit can determine the local CM based on the first feature vector and other clean or dirty feature vectors, e.g., a clean feature vector nearest to the first feature vector.
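
    The decision rule in this abstract (dirty if either the broad CM or a local CM says so) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the Euclidean metric, the spherical dirty region, the `shrink` factor, and the stand-in `broad_cm` threshold are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_local_cm(dirty_vector, clean_vectors, shrink=0.5):
    """Build a local CM around one known-dirty feature vector.

    Assumption: the dirty region is a ball centred on the dirty vector whose
    radius is a fraction (shrink) of the distance to the nearest clean vector.
    """
    nearest_clean = min(clean_vectors, key=lambda v: euclidean(v, dirty_vector))
    radius = shrink * euclidean(nearest_clean, dirty_vector)

    def local_cm(trial):
        return euclidean(trial, dirty_vector) <= radius

    return local_cm


def broad_cm(trial):
    """Hypothetical stand-in for a broad model: flags large feature sums."""
    return sum(trial) > 10.0


def is_dirty(trial, local_cms):
    """A trial vector is dirty if the broad CM or any local CM flags it."""
    return broad_cm(trial) or any(cm(trial) for cm in local_cms)


# A dirty vector that the broad model misses (its feature sum is only 2.0).
missed = (1.0, 1.0)
clean = [(3.0, 1.0), (1.0, 5.0)]
local_cms = [build_local_cm(missed, clean)]

print(is_dirty((1.5, 1.0), local_cms))  # near the missed vector
print(is_dirty((3.0, 1.0), local_cms))  # a known clean vector
print(is_dirty((6.0, 6.0), local_cms))  # caught by the broad model
```

    In this sketch the nearest clean vector bounds the dirty region, so the local model covers neighbours of the missed sample without reaching any known clean sample.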

    Validation-based determination of computational models

    Publication No.: US10826934B2

    Publication date: 2020-11-03

    Application No.: US15402503

    Filing date: 2017-01-10

    Applicant: CrowdStrike, Inc.

    IPC classification: H04L29/06 G06N20/00 G06F21/56

    Abstract: Example techniques described herein determine a validation dataset, determine a computational model using the validation dataset, or determine a signature or classification of a data stream such as a file. The classification can indicate whether the data stream is associated with malware. A processing unit can determine signatures of individual training data streams. The processing unit can determine, based at least in part on the signatures and a predetermined difference criterion, a training set and a validation set of the training data streams. The processing unit can determine a computational model based at least in part on the training set. The processing unit can then operate the computational model based at least in part on a trial data stream to provide a trial model output. Some examples include determining the validation set based at least in part on the training set and the predetermined criterion for difference between data streams.
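
    One way to picture a split driven by signatures and a difference criterion is sketched below. The details are assumptions for illustration only, not the patented method: signatures are modelled as small integers, the difference criterion is a minimum Hamming distance, and `split_by_difference` is a hypothetical helper.

```python
def hamming(a, b):
    """Number of differing bits between two integer signatures."""
    return bin(a ^ b).count("1")


def split_by_difference(signatures, training_ids, min_distance):
    """Split data streams into a training set and a validation set.

    Assumption: a stream qualifies for validation only if its signature
    differs from every training signature by at least min_distance bits,
    so near-duplicates of training data do not inflate validation scores.
    Streams that fail the criterion are left out of the validation set.
    """
    training = {sid: signatures[sid] for sid in training_ids}
    validation = []
    for sid, sig in signatures.items():
        if sid in training:
            continue
        if all(hamming(sig, t) >= min_distance for t in training.values()):
            validation.append(sid)
    return sorted(training), sorted(validation)


signatures = {
    "a": 0b00000000,
    "b": 0b00000001,  # near-duplicate of "a" (1 bit apart)
    "c": 0b11110000,
    "d": 0b00001111,
}

train, val = split_by_difference(signatures, ["a"], min_distance=3)
print(train)  # ['a']
print(val)    # ['c', 'd']
```

    Stream "b" is excluded from validation because it is only one bit away from training stream "a".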

    CLASSIFICATION OF SOURCE DATA BY NEURAL NETWORK PROCESSING

    Publication No.: US20190273509A1

    Publication date: 2019-09-05

    Application No.: US15909372

    Filing date: 2018-03-01

    Applicant: CrowdStrike, Inc.

    Abstract: Example techniques described herein determine a classification of variable-length source data such as executable code. A neural network system that includes a convolution filter, a recurrent neural network, and a fully connected layer can be configured in a computing device to classify executable code. The neural network system can receive executable code of variable length and reduce its dimensionality by generating a variable-length sequence of features extracted from the executable code. The sequence of features is filtered and applied to one or more recurrent neural networks and to a neural network. The output of the neural network classifies the data. Other disclosed systems include a system for reducing the dimensionality of command line input using a recurrent neural network. The reduced-dimensionality command line input may be classified using the disclosed neural network systems.
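
    The data flow named in this abstract (convolution filter, then recurrent network, then fully connected layer) can be traced with a toy forward pass. This is only a sketch under strong simplifying assumptions: the weights are arbitrary untrained constants, the recurrent state is a single scalar, and the function names are hypothetical. It shows how inputs of different lengths reduce to one fixed-size score, not how the patented system is built.

```python
import math


def conv1d(seq, kernel, stride):
    """Strided 1-D convolution: shortens the sequence, keeps it variable-length."""
    k = len(kernel)
    return [
        sum(w * seq[i + j] for j, w in enumerate(kernel))
        for i in range(0, len(seq) - k + 1, stride)
    ]


def rnn_final_state(features, w_in, w_rec):
    """Minimal recurrent cell with a scalar hidden state; returns the last state."""
    h = 0.0
    for x in features:
        h = math.tanh(w_in * x + w_rec * h)
    return h


def dense_sigmoid(h, weight, bias):
    """Fully connected output layer producing a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(weight * h + bias)))


def classify_bytes(code_bytes):
    """Map a byte sequence of any length to a single score (untrained weights)."""
    scaled = [b / 255.0 for b in code_bytes]
    features = conv1d(scaled, kernel=[0.5, -0.25, 0.5, -0.25], stride=2)
    h = rnn_final_state(features, w_in=1.5, w_rec=0.8)
    return dense_sigmoid(h, weight=2.0, bias=-0.5)


short_score = classify_bytes(bytes([0x4D, 0x5A, 0x90, 0x00, 0x03, 0x00]))
long_score = classify_bytes(bytes(range(256)) * 4)
print(short_score, long_score)  # two scores, one per input, regardless of length
```

    Because the recurrent step consumes the feature sequence one element at a time and keeps only its final state, the output layer always sees a fixed-size input.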

    Validation-based determination of computational models

    Publication No.: US11811821B2

    Publication date: 2023-11-07

    Application No.: US17087194

    Filing date: 2020-11-02

    Applicant: CrowdStrike, Inc.

    IPC classification: H04L9/40 G06N20/00 G06F21/56

    Abstract: Example techniques described herein determine a validation dataset, determine a computational model using the validation dataset, or determine a signature or classification of a data stream such as a file. The classification can indicate whether the data stream is associated with malware. A processing unit can determine signatures of individual training data streams. The processing unit can determine, based at least in part on the signatures and a predetermined difference criterion, a training set and a validation set of the training data streams. The processing unit can determine a computational model based at least in part on the training set. The processing unit can then operate the computational model based at least in part on a trial data stream to provide a trial model output. Some examples include determining the validation set based at least in part on the training set and the predetermined criterion for difference between data streams.

    Computational modeling and classification of data streams

    Publication No.: US10832168B2

    Publication date: 2020-11-10

    Application No.: US15402524

    Filing date: 2017-01-10

    Applicant: CrowdStrike, Inc.

    Abstract: Example techniques described herein determine a signature or classification of a data stream such as a file. The classification can indicate whether the data stream is associated with malware. A processor can locate training analysis regions of training data streams based on predetermined structure data, and determine training model inputs based on the training analysis regions. The processor can determine a computational model based on the training model inputs. The computational model can receive an input vector and provide a corresponding feature vector. The processor can then locate a trial analysis region of a trial data stream based on the predetermined structure data and determine a trial model input. The processor can operate the computational model based on the trial model input to provide a trial feature vector, e.g., a signature. The processor can operate a second computational model to provide a classification based on the signature.
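
    The trial-time pipeline in this abstract (locate an analysis region, derive a model input, compute a signature, classify the signature) might be sketched as follows. All specifics are hypothetical stand-ins chosen for illustration: the structure data is a fixed offset and length, the model input is a coarse byte histogram, the first model is a fixed linear map, and the second model is a threshold. None of these choices comes from the patent.

```python
# Hypothetical structure data: where the analysis region sits in a stream.
STRUCTURE = {"offset": 4, "length": 8}

# Hypothetical fixed weights for the first model (4 inputs -> 2 features).
WEIGHTS = [
    [0.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
]


def locate_analysis_region(stream, structure):
    """Slice the analysis region out of the stream using the structure data."""
    start = structure["offset"]
    return stream[start:start + structure["length"]]


def model_input(region, bins=4):
    """Model input: normalised histogram of byte values in coarse bins."""
    counts = [0] * bins
    for b in region:
        counts[b * bins // 256] += 1
    total = len(region) or 1
    return [c / total for c in counts]


def first_model(x, weights):
    """First model: maps an input vector to a feature vector (the signature)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def second_model(signature, threshold=0.5):
    """Second model: classifies based on the signature alone."""
    return "dirty" if signature[0] > threshold else "clean"


def classify_stream(stream):
    region = locate_analysis_region(stream, STRUCTURE)
    signature = first_model(model_input(region), WEIGHTS)
    return signature, second_model(signature)


sig_a, label_a = classify_stream(bytes([0] * 4 + [0xFF] * 8 + [0] * 4))
sig_b, label_b = classify_stream(bytes([0] * 16))
print(sig_a, label_a)  # [1.0, 0.0] dirty
print(sig_b, label_b)  # [0.0, 1.0] clean
```

    Only the bytes inside the analysis region influence the signature, so the two streams differ in their labels even though their leading and trailing bytes are identical.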

    VALIDATION-BASED DETERMINATION OF COMPUTATIONAL MODELS

    Publication No.: US20210075798A1

    Publication date: 2021-03-11

    Application No.: US17087194

    Filing date: 2020-11-02

    Applicant: CrowdStrike, Inc.

    IPC classification: H04L29/06 G06N20/00 G06F21/56

    Abstract: Example techniques described herein determine a validation dataset, determine a computational model using the validation dataset, or determine a signature or classification of a data stream such as a file. The classification can indicate whether the data stream is associated with malware. A processing unit can determine signatures of individual training data streams. The processing unit can determine, based at least in part on the signatures and a predetermined difference criterion, a training set and a validation set of the training data streams. The processing unit can determine a computational model based at least in part on the training set. The processing unit can then operate the computational model based at least in part on a trial data stream to provide a trial model output. Some examples include determining the validation set based at least in part on the training set and the predetermined criterion for difference between data streams.

    VALIDATION-BASED DETERMINATION OF COMPUTATIONAL MODELS

    Publication No.: US20180198800A1

    Publication date: 2018-07-12

    Application No.: US15402503

    Filing date: 2017-01-10

    Applicant: CrowdStrike, Inc.

    IPC classification: H04L29/06 G06N99/00

    Abstract: Example techniques described herein determine a validation dataset, determine a computational model using the validation dataset, or determine a signature or classification of a data stream such as a file. The classification can indicate whether the data stream is associated with malware. A processing unit can determine signatures of individual training data streams. The processing unit can determine, based at least in part on the signatures and a predetermined difference criterion, a training set and a validation set of the training data streams. The processing unit can determine a computational model based at least in part on the training set. The processing unit can then operate the computational model based at least in part on a trial data stream to provide a trial model output. Some examples include determining the validation set based at least in part on the training set and the predetermined criterion for difference between data streams.

    Malware detection using local computational models

    Publication No.: US10726128B2

    Publication date: 2020-07-28

    Application No.: US15657379

    Filing date: 2017-07-24

    Applicant: CrowdStrike, Inc.

    Abstract: Example techniques herein determine that a trial data stream is associated with malware (“dirty”) using a local computational model (CM). The data stream can be represented by a feature vector. A control unit can receive a first, dirty feature vector (e.g., a false miss) and determine the local CM based on the first feature vector. The control unit can receive a trial feature vector representing the trial data stream. The control unit can determine that the trial data stream is dirty if a broad CM or the local CM determines that the trial feature vector is dirty. In some examples, the local CM can define a dirty region in a feature space. The control unit can determine the local CM based on the first feature vector and other clean or dirty feature vectors, e.g., a clean feature vector nearest to the first feature vector.

    CLASSIFICATION OF SOURCE DATA BY NEURAL NETWORK PROCESSING

    Publication No.: US20190273510A1

    Publication date: 2019-09-05

    Application No.: US15909442

    Filing date: 2018-03-01

    Applicant: CrowdStrike, Inc.

    Abstract: Example techniques described herein determine a classification of variable-length source data such as executable code. A neural network system that includes a convolution filter, a recurrent neural network, and a fully connected layer can be configured in a computing device to classify executable code. The neural network system can receive executable code of variable length and reduce its dimensionality by generating a variable-length sequence of features extracted from the executable code. The sequence of features is filtered and applied to one or more recurrent neural networks and to a neural network. The output of the neural network classifies the data. Other disclosed systems include a system for reducing the dimensionality of command line input using a recurrent neural network. The reduced-dimensionality command line input may be classified using the disclosed neural network systems.

    COMPUTATIONAL MODELING AND CLASSIFICATION OF DATA STREAMS

    Publication No.: US20180197089A1

    Publication date: 2018-07-12

    Application No.: US15402524

    Filing date: 2017-01-10

    Applicant: CrowdStrike, Inc.

    Abstract: Example techniques described herein determine a signature or classification of a data stream such as a file. The classification can indicate whether the data stream is associated with malware. A processor can locate training analysis regions of training data streams based on predetermined structure data, and determine training model inputs based on the training analysis regions. The processor can determine a computational model based on the training model inputs. The computational model can receive an input vector and provide a corresponding feature vector. The processor can then locate a trial analysis region of a trial data stream based on the predetermined structure data and determine a trial model input. The processor can operate the computational model based on the trial model input to provide a trial feature vector, e.g., a signature. The processor can operate a second computational model to provide a classification based on the signature.