Patent Application
- Title: CHARACTER STRING UPDATED DEGREE EVALUATION PROGRAM
- Title (Chinese): 字符更新学位评估计划
- Application No.: US12301224
- Filing date: 2007-05-18
- Publication No.: US20090226098A1
- Publication date: 2009-09-10
- Inventors: Masayuki Takahashi, Yoshiki Mikami, T. Katsuko Nakahira
- Applicants: Masayuki Takahashi, Yoshiki Mikami, T. Katsuko Nakahira
- Applicant address: JP Nagaoka-shi
- Assignee: NAGAOKA UNIVERSITY OF TECHNOLOGY
- Current assignee: NAGAOKA UNIVERSITY OF TECHNOLOGY
- Current assignee address: JP Nagaoka-shi
- Priority: JP2006-140850 20060519
- International application: PCT/JP2007/060240 WO 20070518
- Main classification: G06K9/68
- IPC classification: G06K9/68
Abstract:
A character string updated degree evaluation program is provided that makes it possible to quantify the amount of intellectual work involved in editing and updating character strings. The text under comparison is divided into common part character strings, each with a length greater than or equal to a threshold value, and non-common part character strings. The number of edited points relative to the original text and a context edit distance are calculated from the rate of the common part character strings and their occurrence pattern. The number of edited points is obtained from the number of elements in the common part character string set, and the context edit distance is obtained from the change in the order of occurrence of the common part character strings. For the non-common part character strings, a new creation percentage is calculated and an N-gram analysis is performed. The new creation percentage is obtained from the total length of the elements in the non-common part character string set, and a new creation novelty degree is obtained from the non-partial matching rate between the non-common part character string set and an element contained in that set. The calculations for the common part character string set and for the non-common part character string set are then combined to give a text updated degree.
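The abstract's pipeline can be illustrated with a minimal Python sketch. It is not the patented implementation, and several parts are assumptions made only for illustration: the greedy left-to-right matching used to find common parts, the use of an inversion count as a stand-in for the context edit distance, and the approximation of the novelty degree as the share of character N-grams in the non-common parts that do not occur in the original text. All function names, the default threshold of 4, and the default N of 2 are hypothetical. The abstract gives no formula for combining the measures into a text updated degree, so the sketch stops at the individual measures.

```python
def longest_match_at(updated, i, original):
    """Length and original offset of the longest substring of `updated`
    starting at index i that also occurs somewhere in `original`."""
    best_len, best_pos = 0, -1
    length = 1
    while i + length <= len(updated):
        pos = original.find(updated[i:i + length])
        if pos < 0:
            break
        best_len, best_pos = length, pos
        length += 1
    return best_len, best_pos


def split_parts(original, updated, threshold):
    """Split `updated` into common parts (length >= threshold, found in
    `original`) and the non-common runs between them.
    Greedy scanning is an assumption of this sketch."""
    common, non_common, buf = [], [], []
    i = 0
    while i < len(updated):
        n, pos = longest_match_at(updated, i, original)
        if n >= threshold:
            if buf:
                non_common.append("".join(buf))
                buf = []
            common.append((pos, updated[i:i + n]))  # (offset in original, text)
            i += n
        else:
            buf.append(updated[i])
            i += 1
    if buf:
        non_common.append("".join(buf))
    return common, non_common


def count_inversions(seq):
    """Number of pairs that appear out of order (assumed proxy for the
    change in order of occurrence of the common parts)."""
    return sum(1 for a in range(len(seq))
               for b in range(a + 1, len(seq)) if seq[a] > seq[b])


def ngrams(text, n):
    """Set of character n-grams of `text` (empty if text is shorter than n)."""
    return {text[k:k + n] for k in range(len(text) - n + 1)}


def evaluate(original, updated, threshold=4, n=2):
    common, non_common = split_parts(original, updated, threshold)
    new_len = sum(len(s) for s in non_common)
    new_grams = set().union(*(ngrams(s, n) for s in non_common))
    unseen = new_grams - ngrams(original, n)
    return {
        # the abstract derives the number of edited points from this count
        "common_part_count": len(common),
        # assumed stand-in for the context edit distance
        "order_inversions": count_inversions([pos for pos, _ in common]),
        # total non-common length relative to the updated text
        "new_creation_ratio": new_len / len(updated) if updated else 0.0,
        # assumed stand-in for the new creation novelty degree
        "novelty_ratio": len(unseen) / len(new_grams) if new_grams else 0.0,
    }


result = evaluate("the quick brown fox jumps",
                  "jumps over the quick brown cat")
print(result)
```

In this example the updated text keeps two common parts ("jumps" and "the quick brown ") but swaps their order, and the remaining 9 of its 30 characters (" over " and "cat") are treated as newly created.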
Published/Granted Documents
- US08244046B2 Character string updated degree evaluation program, publication/grant date: 2012-08-14