Evaluating automatic malware classifiers in the absence of reference labels

    Publication number: US11977632B2

    Publication date: 2024-05-07

    Application number: US17238248

    Application date: 2021-04-23

    CPC classification number: G06F21/564 G06F18/23 G06F18/2431

    Abstract: Disclosed are methods and apparatuses for classifier evaluation. The evaluation involves constructing a ground truth refinement having a degree of error within specified bounds from a malware reference dataset as an approximate ground truth refinement. The evaluation further involves using the approximate ground truth refinement to determine at least one of: a lower bound on precision or an upper bound on recall and accuracy. The evaluation further involves evaluating a classifier by evaluating at least one of a classification method or clustering method by examining changes to the upper bound and/or the lower bound produced by the approximate ground truth refinement.
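The bounding idea in this abstract can be illustrated with a minimal sketch. The abstract does not define precision or recall, so the code assumes the majority-overlap cluster definitions often used in malware clustering work; the function names and sample data are hypothetical. Under that assumption, if every group in the refinement lies inside a single true family, precision measured against the refinement cannot exceed the true precision, and recall measured against it cannot fall below the true recall.

```python
from collections import Counter, defaultdict


def cluster_precision(predicted, reference):
    """Fraction of samples that belong to the majority reference group
    of their predicted cluster (assumed definition, not from the patent)."""
    overlap = defaultdict(Counter)
    for sample, cluster in predicted.items():
        overlap[cluster][reference[sample]] += 1
    return sum(max(c.values()) for c in overlap.values()) / len(predicted)


def cluster_recall(predicted, reference):
    """Recall is the same computation with the two partitions swapped."""
    return cluster_precision(reference, predicted)


# Hypothetical data: six samples, two true families.
truth = {"a": "fam1", "b": "fam1", "c": "fam1", "d": "fam1",
         "e": "fam2", "f": "fam2"}
# A refinement: each group is a subset of exactly one true family.
refinement = {"a": "r1", "b": "r1", "c": "r2", "d": "r2",
              "e": "r3", "f": "r3"}
# Output of the clustering method under evaluation.
predicted = {"a": "c1", "b": "c1", "c": "c1",
             "d": "c2", "e": "c2", "f": "c2"}

# The truth-based values are shown only to check the bounds;
# in practice they are unavailable.
print("precision lower bound:", cluster_precision(predicted, refinement))
print("true precision:       ", cluster_precision(predicted, truth))
print("recall upper bound:   ", cluster_recall(predicted, refinement))
print("true recall:          ", cluster_recall(predicted, truth))
```

In practice only the refinement is available, so the two bound values would be compared across candidate methods rather than against the true scores.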

    SYSTEM AND METHOD FOR CONVERTING ANTIVIRUS SCAN TO A FEATURE VECTOR

    Publication number: US20240303331A1

    Publication date: 2024-09-12

    Application number: US18475601

    Application date: 2023-09-27

    CPC classification number: G06F21/561 G06N3/0442 G06F2221/034

    Abstract: Provided are methods, systems, and non-transitory computer-readable media for generating a feature vector for malware, including storing, in memory of a computing device, program code for a trained neural network that produces embedded representations for antivirus scan data; executing, by a processor of the computing device, the program code for the trained neural network to perform the operations of: (a) receiving an antivirus scan report (AVSR) for a malware file; (b) normalizing each label in the AVSR by separating the label into a sequence of tokens including a set of token strings; (c) embedding a first token and plural second tokens to generate an input sequence for the malware file; (d) inputting the input sequence into a neural model for producing antivirus scan data; and (e) outputting the antivirus scan data produced by the neural model as one or more feature vectors.
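Steps (a) through (c) of this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the abstract does not say how labels are split or what the "first token" is, so the code assumes splitting on non-alphanumeric characters and a hypothetical `<start>` marker at the head of the sequence. The report layout (engine name mapped to label, `None` for no detection) and all names are also assumptions. Embedding and the neural model in steps (c) through (e) are out of scope here.

```python
import re

# Hypothetical special token placed first in the sequence; the
# abstract does not specify what the "first token" actually is.
START_TOKEN = "<start>"


def normalize_label(label):
    """Lowercase a label and split it on runs of non-alphanumeric
    characters (assumed normalization rule)."""
    return [tok for tok in re.split(r"[^0-9a-z]+", label.lower()) if tok]


def build_input_sequence(report):
    """Turn an antivirus scan report into one flat token sequence.

    `report` maps an engine name to its label, or None when the
    engine reported nothing (assumed layout).
    """
    tokens = [START_TOKEN]
    for engine in sorted(report):  # fixed engine order for reproducibility
        label = report[engine]
        if label:
            tokens.extend(normalize_label(label))
    return tokens


# Hypothetical scan report with made-up engine names.
example_report = {
    "EngineB": "W32/Emotet.A",
    "EngineA": "Trojan.Win32.Emotet!gen",
    "EngineC": None,
}
print(build_input_sequence(example_report))
```

A real pipeline would then map each token to an integer ID and feed the resulting sequence to the trained model to obtain the feature vector.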
