发明申请
US20160212153A1 Code Labeling Based on Tokenized Code Samples 审中-公开
基于令牌代码示例的代码标签

Code Labeling Based on Tokenized Code Samples
摘要:
Disclosed herein are systems and methods for detecting script code malware and generating signatures. A plurality of script code samples are received and transformed into a plurality of tokenized samples. The tokenized samples are based on syntactical elements of the plurality of script code samples. One or more clusters of samples are determined based on similarities in different ones of the plurality of tokenized samples, and known malicious code having a threshold similarity to a representative sample of the cluster of samples is identified. Based on the identifying, the cluster of samples is identified as malicious. Based at least on respective ones of the plurality of tokenized samples associated with the cluster of samples, a generalized code signature usable to identify the script code samples in the cluster of samples is generated.
公开/授权文献
信息查询
0/0