Machine learning methods and systems for determining file risk using content disarm and reconstruction analysis

    Publication number: US12118087B2

    Publication date: 2024-10-15

    Application number: US18548332

    Filing date: 2022-01-28

    CPC classification number: G06F21/56

    Abstract: File risk and malware detection and classification can be enhanced using machine learning analysis of content disarm and reconstruction (CDR) output. Correlations can be discovered or analyzed between individual elements of such outputs, which can include an XML report. Such correlations can provide useful information on threat intelligence and help validate content disarm and reconstruction. A method can include training machine learning algorithms with a dataset derived from CDR results from test files labelled as malicious or not malicious; instructing algorithms to predict probabilities; and determining correlation between the report items and malware (for example, using the function feature importances and the SHAP value method).
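The workflow in this abstract (derive features from CDR reports, train on labelled files, predict probabilities, then rank report items by their link to malware) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the XML schema (`<report>` containing `<item type="..."/>`), the three item types, and the six toy training reports are all hypothetical. A small hand-written logistic regression stands in for the unspecified machine learning algorithms, and permutation importance stands in for the feature importances function and SHAP values named in the abstract.

```python
import math
import random
import xml.etree.ElementTree as ET

# Hypothetical report item types; a real CDR report defines its own vocabulary.
FEATURES = ["macro_removed", "js_removed", "structure_repaired"]

def featurize(report_xml):
    """Count how often each (hypothetical) item type occurs in one XML report."""
    root = ET.fromstring(report_xml)
    counts = dict.fromkeys(FEATURES, 0)
    for item in root.iter("item"):
        kind = item.get("type")
        if kind in counts:
            counts[kind] += 1
    return [float(counts[name]) for name in FEATURES]

def make_report(**counts):
    """Build a toy report such as <report><item type="js_removed"/></report>."""
    items = "".join(f'<item type="{k}"/>' * n for k, n in counts.items())
    return f"<report>{items}</report>"

def predict_proba(weights, bias, x):
    """Logistic model: estimated probability that the file is malicious."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, epochs=500, lr=0.1):
    """Plain stochastic gradient descent on the logistic loss."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            err = predict_proba(weights, bias, x) - label
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def accuracy(weights, bias, X, y):
    hits = sum((predict_proba(weights, bias, x) >= 0.5) == bool(label)
               for x, label in zip(X, y))
    return hits / len(X)

def permutation_importance(weights, bias, X, y, seed=0):
    """Accuracy drop when one feature column is shuffled (stand-in for SHAP)."""
    rng = random.Random(seed)
    base = accuracy(weights, bias, X, y)
    drops = {}
    for j, name in enumerate(FEATURES):
        column = [x[j] for x in X]
        rng.shuffle(column)
        X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, column)]
        drops[name] = base - accuracy(weights, bias, X_perm, y)
    return drops

# Toy dataset: reports from test files labelled malicious (1) or not (0).
labelled = [
    (make_report(macro_removed=2, structure_repaired=1), 1),
    (make_report(macro_removed=1, js_removed=1), 1),
    (make_report(js_removed=2, structure_repaired=1), 1),
    (make_report(structure_repaired=1), 0),
    (make_report(), 0),
    (make_report(structure_repaired=2), 0),
]
X = [featurize(report) for report, _ in labelled]
y = [label for _, label in labelled]

weights, bias = train(X, y)
p_bad = predict_proba(weights, bias, featurize(make_report(macro_removed=2, structure_repaired=1)))
p_ok = predict_proba(weights, bias, featurize(make_report()))
importance = permutation_importance(weights, bias, X, y)
```

Here `p_bad` and `p_ok` are the predicted probabilities for a report with removed macros and for an empty report, and `importance` maps each report item type to the accuracy lost when that item is scrambled. On a dataset this small the importance values are only indicative.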

    Resisting the spread of unwanted code and data

    Publication number: US10419456B2

    Publication date: 2019-09-17

    Application number: US15223257

    Filing date: 2016-07-29

    Abstract: A method or system of receiving an electronic file containing content data in a predetermined data format, the method comprising the steps of: receiving the electronic file, determining the data format, parsing the content data, to determine whether it conforms to the predetermined data format, and if the content data does conform to the predetermined data format, regenerating the parsed data to create a regenerated electronic file in the data format.
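The receive, parse, check, and regenerate steps in this abstract can be illustrated with a toy example. The sketch below assumes a made-up text format (a `TOY1` marker line followed by `key=value` lines with a restricted character set); the format, the `receive` function name, and the rules are hypothetical and are not taken from the patent. The point it shows is that the output file is rebuilt from the parsed fields rather than copied from the input bytes.

```python
import re

MAGIC = "TOY1"  # hypothetical format marker, for illustration only
# Hypothetical rule: lowercase key, '=', value from a small safe alphabet.
LINE_RE = re.compile(r"([a-z_]+)=([A-Za-z0-9 .,-]*)")

def parse(data):
    """Return a list of (key, value) pairs, or None if the data does not conform."""
    try:
        text = data.decode("ascii")
    except UnicodeDecodeError:
        return None
    lines = text.split("\n")
    if lines[0] != MAGIC:
        return None
    fields = []
    for line in lines[1:]:
        if line == "":
            continue  # blank lines carry no content and are not regenerated
        match = LINE_RE.fullmatch(line)
        if match is None:
            return None  # any nonconforming line rejects the whole file
        fields.append((match.group(1), match.group(2)))
    return fields

def regenerate(fields):
    """Write a fresh file from the parsed fields only."""
    body = "".join(f"{key}={value}\n" for key, value in fields)
    return (MAGIC + "\n" + body).encode("ascii")

def receive(data):
    """Parse the incoming bytes and return a regenerated file, or None."""
    fields = parse(data)
    if fields is None:
        return None
    return regenerate(fields)
```

For example, `receive(b"TOY1\ntitle=Report\n\n\nauthor=A. Smith")` yields a normalized file with the stray blank lines gone, while `receive(b"TOY1\ntitle=<script>")` returns `None` because the value falls outside the allowed alphabet.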

    Resisting the spread of unwanted code and data

    Publication number: US11218495B2

    Publication date: 2022-01-04

    Application number: US16539716

    Filing date: 2019-08-13

    Abstract: A method for resisting spread of unwanted code and data without scanning incoming electronic files for unwanted code and data, the method comprising the steps, performed by a computer system, includes receiving, at the computer system, an incoming electronic file containing content data encoded and arranged in accordance with a predetermined file type corresponding to a set of rules, determining a purported predetermined file type of the incoming electronic file by analysing the encoded and arranged content data, the purported predetermined file type and the associated set of rules specifying allowable content data for the purported predetermined file type, parsing the content data by dividing the content data into separate parts in accordance with a predetermined data format identified by the associated set of rules corresponding to the purported predetermined file type and determining nonconforming data in the content data by identifying content data that does not conform to the purported predetermined file format, and if the separate parts of the content data do conform to the predetermined data format, regenerating the allowable parsed content data to create a substitute regenerated electronic file in the purported predetermined file type by extracting the separate parts that do conform and putting them into the substitute regenerated electronic file.

    Resisting the spread of unwanted code and data

    Publication number: US10462163B2

    Publication date: 2019-10-29

    Application number: US16261143

    Filing date: 2019-01-29

    Abstract: A method or system of receiving an incoming electronic file containing content data in a predetermined data format, the method including receiving an incoming electronic file containing content data encoded and arranged in accordance with a predetermined file type, determining a purported predetermined file type of the incoming electronic file and an associated set of rules specifying allowable content data, determining at least an allowable portion of the content data that conforms with the set of rules corresponding to the determined purported predetermined file type, extracting, from the incoming electronic file, the at least an allowable portion of content data, creating a substitute electronic file in the purported predetermined file type, said substitute electronic file containing the extracted allowable content data, forwarding the substitute regenerated electronic file, and forwarding the incoming electronic file if a portion, part or whole of the content data does not conform, only when the intended recipient approves the electronic file at the time of receipt.
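The delivery policy in this abstract (always forward a substitute built from the allowable content, and forward the original only when something did not conform and the recipient approves) might look like the sketch below. The line-based rule set, the `Delivery` type, and the `recipient_approves` callback are assumptions introduced for illustration; the patent does not specify them.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical rule set: a part is allowable if it is a simple key=value line.
ALLOWED = re.compile(r"[a-z_]+=[A-Za-z0-9 .,-]*")

@dataclass
class Delivery:
    substitute: bytes          # always forwarded
    original: Optional[bytes]  # forwarded only with recipient approval

def split_parts(data: bytes) -> List[str]:
    """Divide the content into separate parts (here: non-empty lines)."""
    text = data.decode("ascii", errors="replace")
    return [part for part in text.split("\n") if part]

def process(data: bytes, recipient_approves: Callable[[bytes], bool]) -> Delivery:
    parts = split_parts(data)
    allowed = [part for part in parts if ALLOWED.fullmatch(part)]
    # The substitute file contains only the extracted allowable parts.
    substitute = "".join(part + "\n" for part in allowed).encode("ascii")
    fully_conforming = len(allowed) == len(parts)
    original = None
    # Ask the recipient only when some content failed the rules.
    if not fully_conforming and recipient_approves(data):
        original = data
    return Delivery(substitute, original)
```

With `process(b"title=Q3\nmacro=<evil>\n", lambda _: False)` the recipient gets only the substitute `b"title=Q3\n"`; passing a callback that returns `True` also releases the original bytes.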

    RESISTING THE SPREAD OF UNWANTED CODE AND DATA

    Publication number: US20190158518A1

    Publication date: 2019-05-23

    Application number: US16261143

    Filing date: 2019-01-29

    Abstract: A method or system of receiving an incoming electronic file containing content data in a predetermined data format, the method including receiving an incoming electronic file containing content data encoded and arranged in accordance with a predetermined file type, determining a purported predetermined file type of the incoming electronic file and an associated set of rules specifying allowable content data, determining at least an allowable portion of the content data that conforms with the set of rules corresponding to the determined purported predetermined file type, extracting, from the incoming electronic file, the at least an allowable portion of content data, creating a substitute electronic file in the purported predetermined file type, said substitute electronic file containing the extracted allowable content data, forwarding the substitute regenerated electronic file, and forwarding the incoming electronic file if a portion, part or whole of the content data does not conform, only when the intended recipient approves the electronic file at the time of receipt.

    MACHINE LEARNING METHODS AND SYSTEMS FOR DETERMINING FILE RISK USING CONTENT DISARM AND RECONSTRUCTION ANALYSIS

    Publication number: US20240273198A1

    Publication date: 2024-08-15

    Application number: US18548332

    Filing date: 2022-01-28

    CPC classification number: G06F21/56

    Abstract: File risk and malware detection and classification can be enhanced using machine learning analysis of content disarm and reconstruction (CDR) output. Correlations can be discovered or analyzed between individual elements of such outputs, which can include an XML report. Such correlations can provide useful information on threat intelligence and help validate content disarm and reconstruction. A method can include training machine learning algorithms with a dataset derived from CDR results from test files labelled as malicious or not malicious; instructing algorithms to predict probabilities; and determining correlation between the report items and malware (for example, using the function feature importances and the SHAP value method).

    Resisting the spread of unwanted code and data

    Publication number: US11799881B2

    Publication date: 2023-10-24

    Application number: US17646371

    Filing date: 2021-12-29

    Abstract: A method for resisting spread of unwanted code and data without scanning incoming electronic files for unwanted code and data, the method comprising the steps, performed by a computer system, includes receiving, at the computer system, an incoming electronic file containing content data encoded and arranged in accordance with a predetermined file type corresponding to a set of rules, determining a purported predetermined file type of the incoming electronic file by analysing the encoded and arranged content data, the purported predetermined file type and the associated set of rules specifying allowable content data for the purported predetermined file type, parsing the content data by dividing the content data into separate parts in accordance with a predetermined data format identified by the associated set of rules corresponding to the purported predetermined file type and determining nonconforming data in the content data by identifying content data that does not conform to the purported predetermined file format, and if the separate parts of the content data do conform to the predetermined data format, regenerating the allowable parsed content data to create a substitute regenerated electronic file in the purported predetermined file type by extracting the separate parts that do conform and putting them into the substitute regenerated electronic file.
