SAMPLE-DIFFERENCE-BASED METHOD AND SYSTEM FOR INTERPRETING DEEP-LEARNING MODEL FOR CODE CLASSIFICATION

    Publication (announcement) number: US20240192929A1

    Publication (announcement) date: 2024-06-13

    Application number: US18475447

    Filing date: 2023-09-27

    CPC classification number: G06F8/35 G06F8/42

    Abstract: A sample-difference-based method and system for interpreting a deep-learning model for code classification are provided. The method includes an off-line step of training an interpreter: applying code transformations to every code sample in a training set to generate difference samples; generating further difference samples through feature deletion and code-snippet extraction, and calculating feature importance scores from them; and inputting the original code samples, the difference samples, and the feature importance scores into a neural network to obtain a trained interpreter. The method also includes an on-line step of interpreting code samples: using the trained interpreter to extract important features from the snippets, using an influence-function-based method to identify the training samples that contribute most to the prediction, comparing the extracted important features with those training samples, and generating interpretation results for the object samples. The system includes an off-line training module and an on-line interpretation module.
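
The feature-deletion step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the patented procedure: it assumes that a "feature" is a single code token and that importance is the drop in classifier confidence when that token is deleted. The names `deletion_importance` and `toy_predict`, and the token weights, are hypothetical stand-ins for a real code-classification model.

```python
from typing import Callable, List, Sequence


def deletion_importance(
    tokens: Sequence[str],
    predict: Callable[[Sequence[str]], float],
) -> List[float]:
    """Score each token by the confidence drop caused by deleting it."""
    base = predict(tokens)
    scores = []
    for i in range(len(tokens)):
        # Difference sample: the original sample with token i removed.
        diff_sample = list(tokens[:i]) + list(tokens[i + 1:])
        scores.append(base - predict(diff_sample))
    return scores


def toy_predict(tokens: Sequence[str]) -> float:
    """Hypothetical classifier: confidence that the code is 'unsafe'."""
    weights = {"strcpy": 0.6, "buf": 0.2}  # assumed, for illustration only
    present = sum(w for tok, w in weights.items() if tok in tokens)
    return min(1.0, 0.1 + present)


if __name__ == "__main__":
    sample = ["char", "buf", "strcpy", "src"]
    for tok, score in zip(sample, deletion_importance(sample, toy_predict)):
        print(f"{tok}: {score:.2f}")
```

In this toy setup, deleting `strcpy` lowers the confidence the most, so it receives the highest score. According to the abstract, scores of this kind are fed to a neural network together with the original and difference samples to train the interpreter; that training step is not shown here.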
