METHODS AND APPARATUS FOR DETECTION OF MALICIOUS DOCUMENTS USING MACHINE LEARNING

    Publication No.: US20190236273A1

    Publication date: 2019-08-01

    Application No.: US16257749

    Filing date: 2019-01-25

    Applicant: Sophos Limited

    Abstract: An apparatus for detecting malicious files includes a memory and a processor communicatively coupled to the memory. The processor receives multiple potentially malicious files. A first potentially malicious file has a first file format, and a second potentially malicious file has a second file format different from the first file format. The processor extracts a first set of strings from the first potentially malicious file and a second set of strings from the second potentially malicious file. First and second feature vectors are defined based on the length of each string in the associated set of strings. The processor provides the first feature vector as an input to a machine learning model to produce a maliciousness classification of the first potentially malicious file, and provides the second feature vector as an input to the same machine learning model to produce a maliciousness classification of the second potentially malicious file.
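The format-independent feature step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: the definition of a "string" (a run of at least four printable ASCII bytes), the log-scale histogram, the vector size `NUM_BINS`, and the two sample byte blobs are all assumptions made for the example. The machine learning model itself is left out because the abstract does not specify it.

```python
import re

# Assumption: a "string" is a run of 4 or more printable ASCII bytes.
_PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
NUM_BINS = 8  # hypothetical feature-vector size


def extract_strings(data: bytes) -> list:
    """Return printable byte runs, regardless of the container format."""
    return _PRINTABLE_RUN.findall(data)


def length_feature_vector(strings, num_bins=NUM_BINS):
    """Histogram of string lengths on a log2 scale, normalised to sum to 1."""
    vec = [0.0] * num_bins
    for s in strings:
        if not s:
            continue
        # floor(log2(len)); the last bin absorbs every longer string.
        idx = min(len(s).bit_length() - 1, num_bins - 1)
        vec[idx] += 1.0
    total = sum(vec)
    return [v / total for v in vec] if total else vec


# Two made-up inputs standing in for files of different formats.
pdf_like = b"%PDF-1.4\x00\x01/JavaScript (app.alert)\x00"
ole_like = b"\xd0\xcf\x11\xe0AutoOpen\x00\x02Shell.Run\x00"
vec_a = length_feature_vector(extract_strings(pdf_like))
vec_b = length_feature_vector(extract_strings(ole_like))
```

Because both vectors have the same fixed size, a single model could in principle score files of either format, which is the point the abstract makes about using one model for both inputs.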

    DATA PROCESSING METHOD AND DATA PROCESSING DEVICE

    Publication No.: US20190220710A1

    Publication date: 2019-07-18

    Application No.: US16362186

    Filing date: 2019-03-22

    IPC classification: G06K9/62 G06N20/20

    Abstract: A data processing method includes: generating at least one incremental decision tree according to incremental data; predicting the incremental data based on multiple model decision trees in a classification model and the at least one incremental decision tree to obtain prediction results; and updating the classification model according to the prediction results. In the data processing method according to an embodiment of the present invention, these steps achieve a self-adaptive update of the classification model. No manual intervention is needed during the business cycle of the classification model, which greatly reduces cost.
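The three steps named in this abstract (build an incremental tree, predict the incremental data with the old and new trees together, update the model from the results) might look roughly like the sketch below. It is only an illustration under stated assumptions: the trees are one-split stumps, predictions are combined by majority vote with ties going to the newest tree, and the update rule (keep the new tree only if the combined accuracy on the incremental data does not drop) is hypothetical, since the abstract does not say how the prediction results drive the update.

```python
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split tree (feature, threshold, left label, right label)."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]


def predict_tree(tree, row):
    f, t, l_lab, r_lab = tree
    return l_lab if row[f] <= t else r_lab


def predict_forest(trees, row):
    # Majority vote; iterating newest-first makes ties favour newer trees
    # (an assumption of this sketch).
    votes = Counter(predict_tree(t, row) for t in reversed(trees))
    return votes.most_common(1)[0][0]


def accuracy(trees, X, y):
    return sum(predict_forest(trees, r) == lab for r, lab in zip(X, y)) / len(y)


def update_model(model_trees, X_inc, y_inc):
    """Hypothetical update rule: accept the incremental tree if it does not hurt."""
    candidate = model_trees + [fit_stump(X_inc, y_inc)]
    if accuracy(candidate, X_inc, y_inc) >= accuracy(model_trees, X_inc, y_inc):
        return candidate
    return model_trees


# Existing model: two stumps fitted on older data (made-up values).
model = [(0, 5, 0, 1), (0, 1, 0, 1)]
X_inc, y_inc = [[1], [2], [3], [4]], [0, 0, 1, 1]
updated = update_model(model, X_inc, y_inc)
```

In this toy run the decision boundary has drifted, so the stump fitted on the incremental data is accepted and the model grows from two trees to three without any manual step.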

    Always-On Keyword Detector
    Invention application

    Publication No.: US20190206391A1

    Publication date: 2019-07-04

    Application No.: US16235396

    Filing date: 2018-12-28

    Applicant: SYNTIANT

    Abstract: Provided herein is an integrated circuit including, in some embodiments, a special-purpose host processor, a neuromorphic co-processor, and a communications interface between the host processor and the co-processor configured to transmit information therebetween. The special-purpose host processor is operable as a stand-alone host processor. The neuromorphic co-processor includes an artificial neural network. The co-processor is configured to enhance special-purpose processing of the host processor through the artificial neural network. In such embodiments, the host processor is a keyword identifier processor configured to transmit one or more detected words to the co-processor over the communications interface. The co-processor is configured to transmit recognized words, or other sounds, to the host processor.

    CLASSIFYING WARNING MESSAGES GENERATED BY SOFTWARE DEVELOPER TOOLS

    Publication No.: US20190102277A1

    Publication date: 2019-04-04

    Application No.: US15725250

    Filing date: 2017-10-04

    IPC classification: G06F11/36 G06N99/00

    Abstract: A method for classifying warning messages generated by software developer tools includes receiving a first data set. The first data set includes a first plurality of data entries, where each data entry is associated with a warning message generated based on a first set of software code, includes indications for a plurality of features, and is associated with one of a plurality of class labels. A second data set is generated by sampling the first data set. Based on the second data set, at least one feature is selected from the plurality of features. A third data set is generated by filtering the second data set with the selected at least one feature. A machine learning classifier is determined based on the third data set. The machine learning classifier is used to classify a second warning message generated based on a second set of software code to one of the plurality of class labels.
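The sample, select, filter, and train sequence in this abstract can be sketched as below. Everything concrete here is an assumption for illustration: the feature names (`null_deref`, `in_test_code`), the two class labels, the scoring rule for feature selection (spread of per-class feature rates), and the nearest-neighbour classifier are stand-ins, because the abstract names none of them.

```python
import random
from collections import defaultdict


def feature_scores(entries, names):
    """Score each binary feature by how much its rate differs between labels."""
    by_label = defaultdict(list)
    for e in entries:
        by_label[e["label"]].append(e["features"])
    scores = {}
    for n in names:
        rates = [sum(f[n] for f in rows) / len(rows) for rows in by_label.values()]
        scores[n] = max(rates) - min(rates)
    return scores


def build_classifier(first_set, names, sample_size, k, seed=0):
    # Second data set: a random sample of the first.
    second_set = random.Random(seed).sample(first_set, sample_size)
    # Select the k highest-scoring features from the second data set.
    scores = feature_scores(second_set, names)
    selected = sorted(names, key=lambda n: scores[n], reverse=True)[:k]
    # Third data set: the second, filtered down to the selected features.
    third_set = [(tuple(e["features"][n] for n in selected), e["label"])
                 for e in second_set]
    return selected, third_set


def classify(selected, third_set, features):
    """1-nearest-neighbour by Hamming distance over the selected features."""
    vec = tuple(features[n] for n in selected)
    _, label = min(third_set,
                   key=lambda item: sum(a != b for a, b in zip(item[0], vec)))
    return label


# Made-up first data set: 8 labelled warnings with two binary features.
names = ["in_test_code", "null_deref"]
first_set = [
    {"features": {"null_deref": d, "in_test_code": t},
     "label": "actionable" if d else "non-actionable"}
    for d in (1, 0) for t in (0, 1, 0, 1)
]
selected, third_set = build_classifier(first_set, names, sample_size=6, k=1)
new_warning = {"null_deref": 1, "in_test_code": 1}
predicted = classify(selected, third_set, new_warning)
```

Filtering to the selected features before training keeps the classifier from fitting features that carry no signal about the label, which is presumably the motivation for the third data set.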