-
1.
Publication No.: US20240192929A1
Publication date: 2024-06-13
Application No.: US18475447
Filing date: 2023-09-27
Inventor: Zhen LI, Ruqian ZHANG, Deqing ZOU, Hai JIN, Yangrui LI
Abstract: A sample-difference-based method and system for interpreting a deep-learning model for code classification are provided. The method includes a step of training an interpreter off-line: applying code transformations to every code sample in a training set to generate difference samples; generating difference samples through feature deletion and through code-snippet extraction, respectively, and calculating feature importance scores accordingly; and inputting the original code samples, the difference samples and the feature importance scores into a neural network to obtain a trained interpreter. The method also includes a step of interpreting code samples on-line: using the trained interpreter to extract important features from the snippets, using an influence-function-based method to identify the training samples that contribute most to the prediction, comparing the obtained important features with the most contributive training samples, and generating interpretation results for the object samples. The system includes an off-line training module and an on-line interpretation module.
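The feature-deletion step in the abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes that a "feature" is a single token, that importance is the drop in the classifier's score when that token is deleted, and that `toy_predict` (a hypothetical stand-in) plays the role of the deep-learning classifier. The names `deletion_importance` and `toy_predict` are invented for this example.

```python
from typing import Callable, List, Sequence


def deletion_importance(
    tokens: Sequence[str],
    predict: Callable[[Sequence[str]], float],
) -> List[float]:
    """Score each token by how much the prediction drops when it is deleted."""
    base = predict(tokens)
    scores = []
    for i in range(len(tokens)):
        # Difference sample: the original sample with token i removed.
        diff_sample = list(tokens[:i]) + list(tokens[i + 1:])
        scores.append(base - predict(diff_sample))
    return scores


def toy_predict(tokens: Sequence[str]) -> float:
    # Hypothetical classifier: flags code that calls strcpy as "vulnerable".
    return 0.9 if "strcpy" in tokens else 0.1


tokens = ["char", "buf", "[", "8", "]", ";",
          "strcpy", "(", "buf", ",", "src", ")", ";"]
scores = deletion_importance(tokens, toy_predict)

# Index of the token whose deletion changes the prediction the most.
top = max(range(len(tokens)), key=scores.__getitem__)
print(tokens[top], scores[top])
```

In the abstract, scores of this kind are fed to a neural network together with the original and difference samples to train the interpreter; that training step, and the influence-function analysis used on-line, are outside the scope of this sketch.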
-
2.
Publication No.: US20250013463A1
Publication date: 2025-01-09
Application No.: US18650290
Filing date: 2024-04-30
Inventor: Zhen LI, Junyao YE, Deqing ZOU, Hai JIN, Xianghong ZENG
IPC: G06F8/73, G06V10/764
Abstract: A method, system and processor for enhancing the robustness of a source-code classification model based on invariant features are provided. The method includes: combining non-robustness features to generate different style templates, converting codes in an input-code training set into new codes of different styles to obtain a converted-code training set, merging the input-code and converted-code training sets into an expanded training set, and converting code texts in the expanded training set into code images; and converting the code images into required vectors, pairing samples of identical class randomly picked from the expanded training set and inputting the matched sample pairs into a feature extractor, iteratively updating the feature extractor and the matched sample pairs and extracting target characteristics, and training the extracted invariant features in a classifier to produce a trained model. The disclosed system includes a training-set-expanding module and a model-training module.
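The training-set expansion and same-class pairing described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented procedure: the two style features (indentation and identifier naming), the four-space source indentation, and every function name here are hypothetical, and the conversion of code text to code images and the feature-extractor training are omitted.

```python
import itertools
import random
import re
from typing import Dict, List, Tuple

# Hypothetical non-robust (style) features and their possible values.
STYLE_OPTIONS: Dict[str, List[str]] = {
    "indent": ["    ", "\t"],
    "naming": ["snake", "camel"],
}


def style_templates() -> List[Dict[str, str]]:
    """Combine the style features into every possible template."""
    keys = sorted(STYLE_OPTIONS)
    combos = itertools.product(*(STYLE_OPTIONS[k] for k in keys))
    return [dict(zip(keys, combo)) for combo in combos]


def to_camel(name: str) -> str:
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def apply_style(code: str, template: Dict[str, str]) -> str:
    """Rewrite code in the given style (assumes four-space source indentation)."""
    if template["naming"] == "camel":
        code = re.sub(r"\b[a-z]+(?:_[a-z]+)+\b",
                      lambda m: to_camel(m.group(0)), code)
    lines = []
    for line in code.splitlines():
        stripped = line.lstrip(" ")
        depth = (len(line) - len(stripped)) // 4
        lines.append(template["indent"] * depth + stripped)
    return "\n".join(lines)


def expand(training_set: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Merge the input samples with their restyled variants."""
    expanded = list(training_set)
    for code, label in training_set:
        for template in style_templates():
            new_code = apply_style(code, template)
            if new_code != code:
                expanded.append((new_code, label))
    return expanded


def same_class_pairs(samples: List[Tuple[str, str]],
                     rng: random.Random) -> List[Tuple[str, str, str]]:
    """Randomly pick one pair of distinct samples per class."""
    by_label: Dict[str, List[str]] = {}
    for code, label in samples:
        by_label.setdefault(label, []).append(code)
    pairs = []
    for label, codes in by_label.items():
        if len(codes) >= 2:
            first, second = rng.sample(codes, 2)
            pairs.append((first, second, label))
    return pairs


training_set = [("def add_one(x_val):\n    return x_val + 1", "math")]
expanded = expand(training_set)
pairs = same_class_pairs(expanded, random.Random(0))
print(len(expanded), len(pairs))
```

Each pair holds two stylistic variants of the same class, so a feature extractor trained on such pairs is pushed toward features that do not change with style, which is the intuition behind "invariant features" in the abstract.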
-