HIGH PERFORMANCE PARALLEL INDEXING FOR FORENSICS AND ELECTRONIC DISCOVERY

    Publication number: US20180137115A1

    Publication date: 2018-05-17

    Application number: US15351733

    Application date: 2016-11-15

    Applicant: SAP SE

    CPC classification number: G06F16/313 G06F16/325

    Abstract: A system, a method and a computer program product for indexing data samples are disclosed. A locality-sensitive string hash index is determined for each data sample in a plurality of data samples. The determined locality-sensitive string hash indexes for at least two data samples in the plurality of data samples are compared. The comparison includes estimating, based on the determined locality-sensitive string hash indexes, a distance between the two data samples. Based on the comparison, at least one data sample in the plurality of data samples being similar to at least another data sample in the plurality of data samples is identified.
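
The abstract does not name a particular hash function or distance measure. The following is a minimal sketch of the described flow (hash each sample, compare hashes, estimate a distance, flag similar samples), assuming SimHash over character 4-grams as a stand-in for the locality-sensitive string hash and Hamming distance as the distance estimate. The function names, the shingle size, and the `max_distance` threshold are hypothetical choices made for illustration and are not taken from the patent.

```python
import hashlib
from itertools import combinations

BITS = 64  # assumed fingerprint width; not specified in the abstract


def shingles(text, k=4):
    """Split normalized text into overlapping character k-grams."""
    text = " ".join(text.lower().split())
    if len(text) <= k:
        return [text]
    return [text[i:i + k] for i in range(len(text) - k + 1)]


def simhash(text, k=4):
    """Compute a 64-bit SimHash fingerprint (illustrative LSH choice)."""
    counts = [0] * BITS
    for sh in shingles(text, k):
        digest = hashlib.blake2b(sh.encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        for bit in range(BITS):
            # Each shingle votes +1 or -1 on every bit position.
            counts[bit] += 1 if (h >> bit) & 1 else -1
    # A fingerprint bit is set where the positive votes win.
    return sum(1 << bit for bit in range(BITS) if counts[bit] > 0)


def hamming(a, b):
    """Estimate the distance between two samples from their fingerprints."""
    return bin(a ^ b).count("1")


def similar_pairs(samples, max_distance=16):
    """Return (name_a, name_b, distance) for pairs within max_distance.

    `samples` maps a sample name to its text. The threshold is a
    hypothetical tuning parameter.
    """
    index = {name: simhash(text) for name, text in samples.items()}
    pairs = []
    for a, b in combinations(sorted(index), 2):
        d = hamming(index[a], index[b])
        if d <= max_distance:
            pairs.append((a, b, d))
    return pairs
```

Calling `similar_pairs({"a": text_a, "b": text_b})` returns the pairs whose fingerprints differ in at most `max_distance` bits. Because only fixed-size fingerprints are compared, the hashing step can be run independently per sample, which is what makes this style of index amenable to parallel processing.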

    High performance parallel indexing for forensics and electronic discovery

    Publication number: US10417265B2

    Publication date: 2019-09-17

    Application number: US15351733

    Application date: 2016-11-15

    Applicant: SAP SE

