-
Publication No.: US12032687B2
Publication date: 2024-07-09
Application No.: US17491438
Filing date: 2021-09-30
Inventors: Jack Wilson Stokes, III; Jonathan Bar Or; Christian Seifert; Talha Ongun; Farid Tajaddodianfar
IPC classification: G06F21/55, G06F18/214, G06F21/54, G06F21/56, G06N20/00
CPC classification: G06F21/554, G06F18/214, G06F21/54, G06F21/566, G06N20/00
Abstract: The techniques disclosed herein enable systems to train a machine learning model to classify malicious command line strings and select anomalous and uncertain samples for analysis. To train the machine learning model, a system receives a labeled data set containing command line inputs that are known to be malicious or benign. Utilizing a term embedding model, the system can generate aggregated numerical representations of the command line inputs for analysis by the machine learning model. The aggregated numerical representations can include various information such as term scores that represent a probability that an individual term of the command line string is malicious as well as numerical representations of the individual terms. The system can subsequently provide the aggregated numerical representations to the machine learning model for analysis. Based on the aggregated numerical representations, the machine learning model can learn to distinguish malicious command line inputs from benign inputs.
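Illustrative sketch (not taken from the patent): the abstract describes combining per-term scores with per-term numerical representations into one aggregated vector per command line. The Python below shows one way such a vector could be built. Every name in it is hypothetical. The smoothed frequency estimate stands in for the term score, a hash-derived vector stands in for the output of a learned term embedding model, and mean/max pooling is an assumed aggregation; the patent may use different choices for all three.

```python
import hashlib
from collections import Counter


def tokenize(command_line):
    # Assumption: lowercase whitespace split; a real tokenizer would be more careful.
    return command_line.lower().split()


def term_scores(labeled):
    """Estimate P(malicious | term) with add-one smoothing.

    `labeled` is a list of (command_line, is_malicious) pairs.
    """
    malicious, total = Counter(), Counter()
    for command, is_malicious in labeled:
        for term in set(tokenize(command)):
            total[term] += 1
            if is_malicious:
                malicious[term] += 1
    return {t: (malicious[t] + 1) / (total[t] + 2) for t in total}


def term_vector(term, dim=4):
    # Hypothetical stand-in for a learned term embedding:
    # the first `dim` bytes of a SHA-256 digest, scaled to [0, 1].
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    return [b / 255 for b in digest[:dim]]


def aggregate(command_line, scores, dim=4, unseen=0.5):
    """Return the mean term vector followed by the mean and max term score."""
    terms = tokenize(command_line)
    if not terms:
        return [0.0] * dim + [unseen, unseen]
    vectors = [term_vector(t, dim) for t in terms]
    mean_vec = [sum(col) / len(terms) for col in zip(*vectors)]
    s = [scores.get(t, unseen) for t in terms]  # unseen terms get a neutral score
    return mean_vec + [sum(s) / len(s), max(s)]


# Toy labeled data set, invented for illustration only.
labeled = [
    ("powershell -enc AAAA", True),
    ("powershell -nop -enc BBBB", True),
    ("dir C:\\Users", False),
    ("powershell Get-Date", False),
]
scores = term_scores(labeled)
features = aggregate("powershell -enc CCCC", scores)
```

The resulting `features` list has a fixed length regardless of how many terms the command line contains, so it could be passed to any standard classifier for training.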
-
Publication No.: US11762990B2
Publication date: 2023-09-19
Application No.: US16917626
Filing date: 2020-06-30
IPC classification: G06F21/55, G06F16/955, G06N5/04, G06N20/00
CPC classification: G06F21/554, G06F16/9566, G06N5/04, G06N20/00, G06F2221/034
Abstract: The technology described herein identifies malicious URLs using a classifier that is both accurate and fast. Aspects of the technology are particularly well adapted for use as a real-time URL security analysis tool because the technology is able to quickly process a URL and produce a warning when a malicious URL is identified. The rapid processing speed of the technology described herein is produced, in part, by use of only a single input signal, which is the URL itself. The high accuracy produced by the technology described herein is achieved by analyzing the unstructured text on both a character-by-character level and a word-by-word level. The technology described herein uses both character-level and word-level information from the incoming URL.
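Illustrative sketch (not taken from the patent): the abstract states that the URL string is the only input and that it is analyzed at both the character level and the word level. The Python below shows one simple way to derive both views from a single URL. The function names, the choice of character trigrams, and the word-splitting rule are all assumptions made for illustration; the patented classifier may represent the two levels quite differently.

```python
import re
from collections import Counter


def char_features(url, n=3):
    """Count overlapping character n-grams (trigrams by default)."""
    return Counter(f"c:{url[i:i + n]}" for i in range(len(url) - n + 1))


def word_features(url):
    """Count alphanumeric words, splitting on every other character."""
    words = [w for w in re.split(r"[^A-Za-z0-9]+", url.lower()) if w]
    return Counter(f"w:{w}" for w in words)


def url_features(url):
    # Single input signal: the URL string itself, viewed at two granularities.
    return char_features(url) + word_features(url)


# example.com is a reserved documentation domain; the path is invented.
feats = url_features("http://login.example.com/verify-account")
```

The `c:` and `w:` prefixes keep the two feature families apart in one sparse mapping, which a downstream classifier could consume directly or after vectorization.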
-