Patent application
- Patent title: TRANSFORMATION-BASED FRAMEWORK FOR RECORD MATCHING
- Patent title (Chinese): 用于记录匹配的基于变换的框架
- Application number: US12031715
- Filing date: 2008-02-15
- Publication number: US20090210418A1
- Publication date: 2009-08-20
- Inventors: Arvind Arasu, Surajit Chaudhuri, Shriraghav Kaushik
- Applicants: Arvind Arasu, Surajit Chaudhuri, Shriraghav Kaushik
- Applicant address: US WA Redmond
- Assignee: MICROSOFT CORPORATION
- Current assignee: MICROSOFT CORPORATION
- Current assignee address: US WA Redmond
- Main classification: G06F17/30
- IPC classification: G06F17/30
Abstract:
A transformation-based record matching technique is described. The technique provides a flexible way to account for synonyms and more general forms of string equivalence when performing record matching by taking user-defined transformation rules as explicit input (for example, the fact that “Robert” and “Bob” are synonymous). The input string and the user-defined transformation rules are used to generate a larger set of strings, which are then used when performing record matching. Both the input string and the data elements in a database can be transformed using the user-defined transformation rules in order to generate a larger set of potential record matches. These potential record matches can then be subjected to a threshold test in order to determine one or more best matches. Additionally, signature-based similarity functions are used to improve the computational efficiency of the technique.
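The flow described in the abstract (expand both strings with transformation rules, score the expanded forms, then apply a threshold) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `RULES` table, the function names, the choice of Jaccard similarity over token sets, and the 0.8 threshold. It also compares every pair of variants by brute force and does not implement the signature-based optimization the abstract mentions.

```python
from itertools import product

# Hypothetical user-defined transformation rules: token -> equivalent tokens.
RULES = {
    "bob": ["robert"],
    "robert": ["bob"],
    "st": ["street"],
    "street": ["st"],
}


def variants(record, rules):
    """Return all token sets obtained by keeping or rewriting each token once."""
    tokens = record.lower().split()
    # For each token, the choices are the token itself plus its rule outputs.
    choices = [[tok] + rules.get(tok, []) for tok in tokens]
    return {frozenset(combo) for combo in product(*choices)}


def jaccard(a, b):
    """Jaccard similarity of two token sets (assumed similarity function)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def transformed_similarity(query, candidate, rules):
    """Best similarity over all pairs of transformed query/candidate variants."""
    return max(
        jaccard(q, c)
        for q in variants(query, rules)
        for c in variants(candidate, rules)
    )


def match(query, database, rules, threshold=0.8):
    """Return (score, record) pairs that pass the threshold, best first."""
    scored = [(transformed_similarity(query, rec, rules), rec) for rec in database]
    return sorted([pair for pair in scored if pair[0] >= threshold], reverse=True)


database = ["Robert Smith 12 Main Street", "Alice Jones 4 Oak Ave"]
results = match("Bob Smith 12 Main St", database, RULES)
```

With the sample rules, "Bob Smith 12 Main St" and "Robert Smith 12 Main Street" share a common transformed form, so that record passes the threshold with a score of 1.0, whereas the same pair scores only 3/7 when no rules are supplied. The number of variants grows multiplicatively with the number of rewritable tokens, which is one reason an efficiency technique such as the signature-based approach named in the abstract would matter in practice.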
Published/granted documents
- US08032546B2 Transformation-based framework for record matching (publication/grant date: 2011-10-04)