Deduplication and disambiguation
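    Granted invention patent

    Publication (announcement) number: US10185738B1

    Publication (announcement) date: 2019-01-22

    Application number: US15253588

    Filing date: 2016-08-31

    IPC classification: G06F17/30 G06N7/00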

    Abstract: Systems and methods for deduplication and disambiguation are disclosed. In example embodiments, a server accesses stored information about a first entity and stored information about a second entity. The server determines, based on the accessed stored information about the first entity and the accessed stored information about the second entity, a set of information items known about both the first entity and the second entity. The server computes, based on the set of information items, a probability that the first entity corresponds to the second entity by computing one or more expressiveness scores corresponding to a value of a first information item and a value of a second information item from the set of information items. The server provides, as a digital transmission, an output representing the computed probability.
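
    The flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define how an expressiveness score is calculated or how scores are combined, so everything specific here is an assumption made for illustration: expressiveness is taken to be one minus the relative frequency of a value in a reference set of records, agreement on a value multiplies the match odds by the inverse of that frequency, and disagreement multiplies them by a fixed penalty. The names `shared_items`, `expressiveness`, `match_probability`, `prior`, and `mismatch_factor` are hypothetical and do not come from the patent.

```python
from collections import Counter


def shared_items(a, b):
    """Return the fields that have a known (non-None) value for both entities."""
    return sorted(k for k in a.keys() & b.keys()
                  if a[k] is not None and b[k] is not None)


def expressiveness(field, value, records):
    """Hypothetical score in [0, 1]: rare values are more expressive.

    Assumption: 1 minus the relative frequency of `value` among the
    known values of `field` in a reference set of records.
    """
    values = [r[field] for r in records if r.get(field) is not None]
    if not values:
        return 0.0
    return 1.0 - Counter(values)[value] / len(values)


def match_probability(a, b, records, prior=0.5, mismatch_factor=0.1):
    """Combine per-field evidence into a probability that a and b match.

    Assumption: fields are treated as independent and combined as odds,
    in the style of a naive Bayes classifier.
    """
    odds = prior / (1.0 - prior)
    for field in shared_items(a, b):
        if a[field] == b[field]:
            # Chance of agreeing by coincidence is roughly the value's frequency.
            coincidence = max(1.0 - expressiveness(field, a[field], records), 1e-9)
            odds *= 1.0 / coincidence
        else:
            # Fixed penalty for disagreement (an illustrative choice).
            odds *= mismatch_factor
    return odds / (1.0 + odds)


# Hypothetical reference data, invented for this example.
records = [
    {"name": "J. Smith", "city": "Austin"},
    {"name": "J. Smith", "city": "Austin"},
    {"name": "A. Jones", "city": "Austin"},
    {"name": "B. Lee", "city": "Boston"},
]

common = match_probability({"city": "Austin"}, {"city": "Austin"}, records)
rare = match_probability({"name": "B. Lee"}, {"name": "B. Lee"}, records)
```

    Under these assumptions, agreement on a rare value (`rare`) yields a higher probability than agreement on a common one (`common`), and two entities with no information items in common fall back to the prior. A real system would need a principled way to estimate these quantities; the sketch only shows the shape of the computation.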