Abstract:
Execution traces are collected from multiple execution instances that exhibit performance issues such as slow execution. Call stacks are extracted from the execution traces, and the call stacks are mined to identify frequently occurring function call patterns. The call patterns are then clustered, and used to identify groups of execution instances whose performance issues may be caused by common problematic program execution patterns.
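
The pipeline described above (extract call stacks, mine frequent call patterns, group execution instances) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the trace format, the function names, the choice of contiguous call runs as "patterns", and the grouping rule (patterns that occur in exactly the same traces) are hypothetical stand-ins, not the method claimed in the abstract.

```python
from collections import defaultdict

def contiguous_patterns(stack, min_len=2, max_len=3):
    """Yield every contiguous run of calls of length min_len..max_len."""
    for n in range(min_len, max_len + 1):
        for i in range(len(stack) - n + 1):
            yield tuple(stack[i:i + n])

def mine_frequent_patterns(traces, min_support=2):
    """Return {pattern: trace ids} for patterns seen in >= min_support traces."""
    seen_in = defaultdict(set)
    for trace_id, stacks in traces.items():
        for stack in stacks:
            for pattern in contiguous_patterns(stack):
                seen_in[pattern].add(trace_id)
    return {p: ids for p, ids in seen_in.items() if len(ids) >= min_support}

def group_by_shared_traces(frequent):
    """Crude stand-in for clustering: group patterns occurring in the same traces."""
    groups = defaultdict(list)
    for pattern, ids in frequent.items():
        groups[frozenset(ids)].append(pattern)
    return dict(groups)

# Hypothetical traces: each maps an execution instance to its call stacks.
traces = {
    "t1": [["main", "load", "parse", "read_disk"]],
    "t2": [["main", "load", "parse", "read_disk"]],
    "t3": [["main", "render", "lock", "wait"]],
    "t4": [["main", "render", "lock", "wait"]],
}
frequent = mine_frequent_patterns(traces)
groups = group_by_shared_traces(frequent)
```

In this toy data the slow instances fall into two groups, one sharing the load/parse/read_disk patterns and one sharing the render/lock/wait patterns, which is the kind of grouping the abstract attributes to common problematic execution patterns.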
Abstract:
Techniques for detecting, analyzing, and/or reporting code clones are described herein. In one or more implementations, code-clone detection is performed on one or more source code bases to find true and near clones of a subject code snippet that a user (e.g., a software developer) expressly or implicitly selected. In one or more other implementations, code clones are analyzed to estimate their code-improvement-potential properties (such as bug potential and code-refactoring potential). One or more other implementations present the results of code clone analysis with indications (e.g., rankings) of the estimated properties of the respective clones.
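
As a rough illustration of finding true and near clones of a selected snippet, the sketch below normalizes identifiers and numbers to placeholders and compares token sequences with the standard-library `difflib.SequenceMatcher`. The tokenizer, the placeholder scheme, the 0.8 threshold, and all function names are assumptions for this example; the abstract does not say how clones are detected or ranked.

```python
import difflib
import keyword
import re

# Very rough tokenizer: identifiers, integers, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def normalize(code):
    """Tokenize and replace identifiers and numbers with placeholders."""
    out = []
    for tok in TOKEN.findall(code):
        if keyword.iskeyword(tok):
            out.append(tok)            # keep language keywords as-is
        elif tok[0].isalpha() or tok[0] == "_":
            out.append("ID")           # any identifier
        elif tok.isdigit():
            out.append("NUM")          # any integer literal
        else:
            out.append(tok)            # punctuation and operators
    return out

def clone_similarity(a, b):
    """Similarity in [0, 1] between two snippets after normalization."""
    matcher = difflib.SequenceMatcher(None, normalize(a), normalize(b), autojunk=False)
    return matcher.ratio()

def find_clones(snippet, candidates, threshold=0.8):
    """Return (score, name) pairs at or above the threshold, best first."""
    scored = [(clone_similarity(snippet, code), name) for name, code in candidates.items()]
    return sorted([pair for pair in scored if pair[0] >= threshold], reverse=True)

# Hypothetical subject snippet and candidate code fragments.
snippet = "total = 0\nfor x in items:\n    total += x.price"
candidates = {
    "renamed": "acc = 0\nfor it in cart:\n    acc += it.cost",
    "near": "acc = 0\nfor it in cart:\n    acc += it.cost * 2",
    "unrelated": "print('hello world')",
}
clones = find_clones(snippet, candidates)
```

Here the renamed copy scores as identical after normalization, the copy with an extra multiplication scores slightly lower, and the unrelated fragment falls below the threshold. The sorted scores stand in for the rankings the abstract mentions, but they measure only textual similarity, not bug or refactoring potential.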
Abstract:
Techniques for error report processing are described herein. Error reports, received by a developer due to program crashes, may be organized into a plurality of “buckets.” The buckets may be based in part on a name and a version of the application associated with a crash. Additionally, a call stack of the computer on which the crash occurred may be associated with each error report. The error reports may be “re-bucketed” into meta-buckets to provide additional information to programmers working to resolve software errors. The re-bucketing may be based in part on measuring similarity of call stacks of a plurality of error reports. The similarity of two call stacks—a measure of likelihood that two error reports were caused by a same error—may be based in part on functions in common, a distance of those functions from the crash point, and an offset distance between the common functions.
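
The three ingredients named above (functions in common, their distance from the crash point, and the offset between matched functions) can be combined in many ways. The sketch below is one hypothetical combination: an alignment computed by dynamic programming in which each matched function contributes a weight that decays exponentially with both distances. The decay constants `c` and `o`, the normalization, and the function name are assumptions for illustration, not the formula used by the described techniques.

```python
import math

def stack_similarity(a, b, c=0.5, o=0.5):
    """Similarity in [0, 1] of two call stacks; index 0 is the crash point.

    c penalizes matches far from the crash point; o penalizes matches whose
    positions differ between the two stacks (the offset distance).
    """
    m, n = len(a), len(b)
    if m == 0 or n == 0:
        return 0.0
    # best[i][j] = best weighted score aligning a[:i] with b[:j]
    best = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match = 0.0
            if a[i - 1] == b[j - 1]:
                dist = min(i - 1, j - 1)   # distance from the crash point
                offset = abs(i - j)        # positional offset between stacks
                match = math.exp(-c * dist) * math.exp(-o * offset)
            best[i][j] = max(best[i - 1][j - 1] + match,
                             best[i - 1][j],
                             best[i][j - 1])
    # Normalize by the best score achievable by the shorter stack.
    norm = sum(math.exp(-c * k) for k in range(min(m, n)))
    return best[m][n] / norm

# Hypothetical call stacks, crash point first.
s1 = ["memcpy", "copy_buf", "handle_req", "main"]
s2 = ["memcpy", "copy_buf", "dispatch", "handle_req", "main"]
s3 = ["free", "cleanup", "main"]
```

With these inputs, `s1` and `s2` share their top frames and score much higher than `s1` and `s3`, which share only `main` at the bottom. A re-bucketing step could then merge reports whose pairwise score exceeds a chosen threshold.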
Abstract:
A call pattern database is mined to identify frequently occurring call patterns related to program execution instances. An SVM classifier is iteratively trained based at least in part on classifications provided by human analysts; at each iteration, the SVM classifier identifies boundary cases, and requests human analysis of these cases. The trained SVM classifier is then applied to call pattern pairs to produce similarity measures between respective call patterns of each pair, and the call patterns are clustered based on the similarity measures.
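
The final step, clustering call patterns from pairwise similarity measures, can be sketched independently of how the similarities are produced. The example below uses threshold-based single-link clustering with a union-find structure. The `jaccard` function is only a stand-in for the output of the trained SVM classifier, and the threshold, function names, and sample patterns are assumptions for illustration.

```python
from itertools import combinations

def jaccard(p, q):
    """Stand-in similarity (NOT the SVM): overlap of the functions in two patterns."""
    a, b = set(p), set(q)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_patterns(patterns, similarity, threshold=0.5):
    """Single-link clustering: merge any two patterns whose similarity >= threshold."""
    parent = list(range(len(patterns)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(patterns)), 2):
        if similarity(patterns[i], patterns[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i, pattern in enumerate(patterns):
        clusters.setdefault(find(i), []).append(pattern)
    return sorted(clusters.values(), key=len, reverse=True)

# Hypothetical call patterns mined from a pattern database.
patterns = [
    ("main", "load", "parse"),
    ("load", "parse", "read_disk"),
    ("render", "lock", "wait"),
    ("lock", "wait"),
]
clusters = cluster_patterns(patterns, jaccard)
```

Any function that maps a pair of patterns to a score can be passed as `similarity`, so a learned measure could replace `jaccard` without changing the clustering code.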
Abstract:
A code verification system is described herein that provides augmented code review with code clone analysis and visualization to help software developers automatically identify similar instances of the same code and to visualize differences in versions of software code over time. The system uses code clone search technology to identify code clones and to present the developer with information about similar code as the developer makes changes. The system may provide automated notification to the developer or to other teams as changes are made to code segments with one or more related clones. The code verification system also helps the developer to understand architectural evolution of a body of software code. The code verification system provides an analysis component for determining architectural differences between two versions of the software code base, based on the code clone detection results. The code verification system also provides a user interface component for displaying identified differences to developers and others involved with the software development process in intuitive and useful ways.
Abstract:
A system for frequent pattern mining uses two layers of processing: a plurality of computing nodes, and a plurality of processors within each computing node. Within each computing node, the data set against which the frequent pattern mining is to be performed is stored in shared memory, accessible concurrently by each of the processors. The search space is partitioned among the computing nodes, and sub-partitioned among the processors of each computing node. If a processor completes its sub-partition, it requests another sub-partition. The partitioning and sub-partitioning may be performed dynamically, and adjusted in real time.
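
The two-layer scheme can be sketched on a single machine, with worker threads standing in for the processors of a node. In this sketch the "nodes" are simulated sequentially, the search space is limited to item pairs partitioned by their first item, and the node-level partition is static; all of these, along with the function names, are simplifying assumptions rather than details from the abstract. What the sketch does show is the shared data set and the processors requesting another sub-partition when they finish one.

```python
import queue
import threading

def mine_node(transactions, prefixes, min_support, workers=2):
    """One simulated node: workers share `transactions` and pull prefixes on demand."""
    tasks = queue.Queue()
    for prefix in prefixes:
        tasks.put(prefix)              # each prefix is one sub-partition
    results, lock = {}, threading.Lock()

    def worker():
        while True:
            try:
                prefix = tasks.get_nowait()   # request another sub-partition
            except queue.Empty:
                return                        # nothing left for this node
            local = {}
            for t in transactions:            # shared, read-only data set
                if prefix in t:
                    for other in t:
                        if other > prefix:
                            pair = (prefix, other)
                            local[pair] = local.get(pair, 0) + 1
            with lock:
                results.update({k: v for k, v in local.items() if v >= min_support})

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results

def mine(transactions, min_support=2, nodes=2, workers=2):
    """Partition item prefixes across simulated nodes, then merge their results."""
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for n in range(nodes):             # nodes run one after another in this sketch
        frequent.update(mine_node(transactions, items[n::nodes], min_support, workers))
    return frequent

# Hypothetical transactions over single-character items.
transactions = [frozenset("abc"), frozenset("abd"), frozenset("acd"), frozenset("bc")]
frequent_pairs = mine(transactions)
```

Because workers take prefixes from a queue rather than receiving a fixed share, a worker that finishes a cheap sub-partition simply picks up the next one, which approximates the dynamic balancing within a node that the abstract describes.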