Method of Linking Electronic Database Records
    Patent application
    Status: under examination (published)

    Publication number: US20130238623A1

    Publication date: 2013-09-12

    Application number: US13577195

    Filing date: 2011-02-03

    IPC classification: G06F17/30

    Abstract: A method of linking electronic database records, wherein each record is associated with a single member from a set of unique members, and wherein each member from the set of unique members has associated with it a plurality of identifiers for uniquely identifying the member. The method comprises the following steps. First, a set of combinations of identifiers is obtained by determining, for each record, a plurality of identifiers using the data stored in the record. Next, a set of clusters of linked combinations of identifiers is created by creating a link between any combinations that have equal identifiers. For each cluster from the set of clusters, a quality value is calculated, and any cluster whose quality value is below a pre-determined threshold is split into two or more clusters by removing one or more links between combinations in the original cluster, provided that the resulting clusters have higher quality values than the original cluster. Finally, any records whose corresponding combinations of identifiers are members of the same cluster are linked.
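
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation, and everything the abstract leaves open is an assumption here: the identifier types in `FIELDS` (`email`, `phone`, `passport`) are hypothetical, two combinations are linked when they agree on at least one non-empty identifier, the quality value is taken to be the mean over identifier types of 1 / (number of distinct values in the cluster), and a split is tried by removing all links of one identifier type at a time.

```python
from collections import defaultdict
from itertools import combinations

FIELDS = ("email", "phone", "passport")  # hypothetical identifier types


def components(nodes, links):
    """Connected components of an undirected graph given as (a, b, field) links."""
    adj = defaultdict(set)
    for a, b, _ in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, out = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        out.append(frozenset(comp))
    return out


def quality(cluster):
    """Assumed metric: mean over fields of 1 / (number of distinct values)."""
    scores = []
    for i in range(len(FIELDS)):
        distinct = {combo[i] for combo in cluster if combo[i] is not None}
        scores.append(1.0 / max(len(distinct), 1))
    return sum(scores) / len(scores)


def link_records(records, threshold=0.8):
    # Step 1: one combination of identifiers per record.
    combos = {rid: tuple(rec.get(f) for f in FIELDS) for rid, rec in records.items()}
    nodes = sorted(set(combos.values()), key=repr)
    # Step 2: link combinations that share an equal, non-empty identifier.
    links = [(a, b, i) for a, b in combinations(nodes, 2)
             for i in range(len(FIELDS)) if a[i] is not None and a[i] == b[i]]
    final = []
    for cluster in components(nodes, links):
        best, q = [cluster], quality(cluster)
        if q < threshold:
            # Step 3: try dropping every link of one identifier type and keep
            # the split whose worst part scores highest, if all parts beat q.
            inner = [l for l in links if l[0] in cluster]
            best_score = q
            for i in range(len(FIELDS)):
                kept = [l for l in inner if l[2] != i]
                parts = components(sorted(cluster, key=repr), kept)
                worst = min(quality(p) for p in parts)
                if len(parts) > 1 and worst > best_score:
                    best, best_score = parts, worst
        final.extend(best)
    # Step 4: records whose combinations share a cluster are linked.
    cluster_of = {combo: idx for idx, c in enumerate(final) for combo in c}
    groups = defaultdict(list)
    for rid, combo in combos.items():
        groups[cluster_of[combo]].append(rid)
    return sorted(sorted(g) for g in groups.values())


example = {
    "r1": {"email": "a@x", "phone": "555", "passport": "P1"},
    "r2": {"email": "a@x", "passport": "P1"},
    "r3": {"email": "b@x", "phone": "555", "passport": "P2"},
    "r4": {"email": "b@x", "passport": "P2"},
    "r5": {"email": "c@x", "phone": "777", "passport": "P3"},
}
print(link_records(example))  # [['r1', 'r2'], ['r3', 'r4'], ['r5']]
```

In the example data, the shared phone number `555` initially bridges two different members into one cluster with quality 2/3, which is below the assumed threshold of 0.8. Removing the phone links yields two clusters of quality 1.0, so the split is accepted. A real system would need a domain-specific quality function and a finer-grained choice of which links to remove.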
