-
Publication No.: US11977632B2
Publication Date: 2024-05-07
Application No.: US17238248
Filing Date: 2021-04-23
Applicant: Booz Allen Hamilton Inc.
Inventor: Robert J. Joyce, Edward Raff
IPC: G06F21/56, G06F18/23, G06F18/2431
CPC classification number: G06F21/564, G06F18/23, G06F18/2431
Abstract: Disclosed are methods and apparatuses for classifier evaluation. The evaluation involves constructing a ground truth refinement having a degree of error within specified bounds from a malware reference dataset as an approximate ground truth refinement. The evaluation further involves using the approximate ground truth refinement to determine at least one of: a lower bound on precision or an upper bound on recall and accuracy. The evaluation further involves evaluating a classifier by evaluating at least one of a classification method or clustering method by examining changes to the upper bound and/or the lower bound produced by the approximate ground truth refinement.
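The bounding idea in this abstract can be illustrated with a small sketch. It assumes the common set-matching definitions of clustering precision and recall (each group is credited with its largest overlap against the other partition); the abstract does not give the patent's exact formulas, so these definitions, the function names, and the toy data are illustrative assumptions only. Under these definitions, scoring against an exact refinement of the ground truth can only lower precision and can only raise recall, which is what makes the refinement usable for bounds when the true labels are unavailable.

```python
from collections import Counter, defaultdict


def _best_overlap_fraction(groups_of, other):
    """Fraction of items covered when each group in `groups_of` is matched
    to the single group in `other` that it overlaps most."""
    overlaps = defaultdict(Counter)
    for item, group in groups_of.items():
        overlaps[group][other[item]] += 1
    return sum(max(c.values()) for c in overlaps.values()) / len(groups_of)


def precision(pred, ref):
    # Assumed definition: each predicted cluster is credited with its
    # largest overlap against any reference group.
    return _best_overlap_fraction(pred, ref)


def recall(pred, ref):
    # Assumed definition: each reference group is credited with its
    # largest overlap against any predicted cluster.
    return _best_overlap_fraction(ref, pred)


# Toy data (hypothetical): six samples, two true families.
truth = {"a": "X", "b": "X", "c": "X", "d": "Y", "e": "Y", "f": "Y"}
# A refinement: every refinement group lies inside one true family.
refinement = {"a": "r1", "b": "r1", "c": "r2", "d": "r3", "e": "r3", "f": "r4"}
# Output of some clustering method under evaluation.
pred = {"a": "c1", "b": "c1", "d": "c1", "c": "c2", "e": "c2", "f": "c2"}

precision_lower_bound = precision(pred, refinement)
recall_upper_bound = recall(pred, refinement)

print("precision >=", precision_lower_bound)
print("recall    <=", recall_upper_bound)
```

In practice the true labels in `truth` would be unknown; they are included here only so the bounds can be checked against the values they are meant to bound. An approximate refinement with bounded error, as the abstract describes, would loosen these guarantees by a corresponding margin.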
-
Publication No.: US20240303331A1
Publication Date: 2024-09-12
Application No.: US18475601
Filing Date: 2023-09-27
Applicant: Booz Allen Hamilton Inc.
Inventor: Robert J. Joyce, Edward Simon Pastor Raff
IPC: G06F21/56, G06N3/0442
CPC classification number: G06F21/561, G06N3/0442, G06F2221/034
Abstract: Provided are methods, systems, and non-transitory computer-readable media for generating a feature vector for malware, including storing, in memory of a computing device, program code for a trained neural network that produces embedded representations for antivirus scan data; executing, by a processor of the computing device, the program code for the trained neural network to perform the operations of: (a) receiving an antivirus scan report (AVSR) for a malware file; (b) normalizing each label in the AVSR by separating the label into a sequence of tokens including a set of token strings; (c) embedding a first token and plural second tokens to generate an input sequence for the malware file; (d) inputting the input sequence into a neural model for producing antivirus scan data; and (e) outputting the antivirus scan data produced by the neural model as one or more feature vectors.
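Step (b), and the assembly of a token sequence ahead of step (c), can be sketched as follows. This is a minimal illustration, not the patented method: the delimiter rule (split on any non-alphanumeric character), the lowercasing, the `<start>` marker standing in for the "first token", and the engine names and labels in the sample report are all assumptions, since the abstract does not specify them. The embedding and neural model of steps (c) to (e) are not shown.

```python
import re


def normalize_label(label):
    """Split one antivirus label into lowercase alphanumeric tokens.

    Assumption: any run of non-alphanumeric characters is a delimiter.
    """
    return [tok for tok in re.split(r"[^0-9A-Za-z]+", label.lower()) if tok]


def build_token_sequence(report, start_token="<start>"):
    """Concatenate the tokens of every label in a scan report.

    `report` maps an engine name to the label it assigned. The leading
    `start_token` is a hypothetical stand-in for the abstract's
    "first token"; the label tokens play the role of the "second tokens".
    """
    sequence = [start_token]
    for label in report.values():
        sequence.extend(normalize_label(label))
    return sequence


# Hypothetical scan report for one file.
sample_report = {
    "EngineA": "Trojan.Win32.Agent!gen",
    "EngineB": "Win32/Emotet.A",
}

tokens = build_token_sequence(sample_report)
print(tokens)
```

A real pipeline would then map each token to an integer ID from a fixed vocabulary before passing the sequence to the embedding layer.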
-