Invention Grant
- Patent Title: Identification of topics in source code
- Patent Title (中): 识别源代码中的主题
- Application No.: US12212534
- Application Date: 2008-09-17
- Publication No.: US08209665B2
- Publication Date: 2012-06-26
- Inventor: Girish Maskeri Rama, Kenneth Heafield, Santonu Sarkar
- Applicant: Girish Maskeri Rama, Kenneth Heafield, Santonu Sarkar
- Applicant Address: IN Bangalore
- Assignee: Infosys Limited
- Current Assignee: Infosys Limited
- Current Assignee Address: IN Bangalore
- Agency: Klarquist Sparkman, LLP
- Main IPC: G06F9/44
- IPC: G06F9/44

Abstract:
Topics in source code can be identified using Latent Dirichlet Allocation (LDA) by receiving source code, identifying domain specific keywords from the source code, generating a keyword matrix, processing the keyword matrix and the source code using LDA, and outputting a list of topics. The list of topics is output as collections of domain specific keywords. Probabilities of domain specific keywords belonging to their respective topics can also be output. The keyword matrix comprises weighted sums of occurrences of domain specific keywords in the source code.
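As an illustration of the keyword-matrix step only, the following is a minimal sketch and not the patented method. The abstract says the matrix holds weighted sums of keyword occurrences but does not give the weights, so the location weights, the stop list, the `//` comment convention, and all function names below are assumptions made for this example. The LDA step itself is not shown; the resulting rows could be handed to any LDA implementation.

```python
import re
from collections import Counter

# Assumed weights: the abstract only says "weighted sums", so these values
# are hypothetical. Here a keyword in code counts double one in a comment.
LOCATION_WEIGHTS = {"code": 2.0, "comment": 1.0}

# Assumed stop list of language keywords and generic words (hypothetical).
STOP_WORDS = {"public", "private", "class", "void", "int", "return",
              "new", "the", "for", "get", "set"}


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", name)
    return [part.lower() for part in parts]


def keyword_weights(source):
    """Return the weighted occurrence count of each keyword in one file."""
    weights = Counter()
    for line in source.splitlines():
        # Treat everything after '//' as a comment (a simplification).
        code, _, comment = line.partition("//")
        for location, text in (("code", code), ("comment", comment)):
            for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text):
                for word in split_identifier(token):
                    # Drop stop words and very short fragments.
                    if word not in STOP_WORDS and len(word) > 2:
                        weights[word] += LOCATION_WEIGHTS[location]
    return weights


def build_keyword_matrix(files):
    """Build a file-by-keyword matrix from a {filename: source} mapping.

    Returns (vocabulary, matrix); matrix[i][j] is the weighted sum of
    occurrences of vocabulary[j] in the i-th file (in mapping order).
    """
    per_file = {name: keyword_weights(src) for name, src in files.items()}
    vocabulary = sorted(set().union(*per_file.values()))
    matrix = [[per_file[name].get(word, 0.0) for word in vocabulary]
              for name in files]
    return vocabulary, matrix


if __name__ == "__main__":
    demo = {
        "Account.java": "class Account { int accountBalance; } // customer account balance",
        "Loan.java": "class Loan { int loanRate; } // interest rate of the loan",
    }
    vocabulary, matrix = build_keyword_matrix(demo)
    print(vocabulary)
    for row in matrix:
        print(row)
```

Each row describes one source file and each column one candidate keyword, so files that share vocabulary end up with similar rows, which is the kind of input a topic model groups into topics.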
Public/Granted literature
- US20090254884A1 IDENTIFICATION OF TOPICS IN SOURCE CODE Public/Granted day: 2009-10-08