-
Publication number: US20180137115A1
Publication date: 2018-05-17
Application number: US15351733
Application date: 2016-11-15
Applicant: SAP SE
Inventor: Udo Klein, Philipp Scholl
IPC: G06F17/30
CPC classification number: G06F16/313, G06F16/325
Abstract: A system, a method and a computer program product for indexing data samples are disclosed. A locality-sensitive string hash index is determined for each data sample in a plurality of data samples. The determined locality-sensitive string hash indexes for at least two data samples in the plurality of data samples are compared. The comparison includes estimating, based on the determined locality-sensitive string hash indexes, a distance between the two data samples. Based on the comparison, at least one data sample in the plurality of data samples being similar to at least another data sample in the plurality of data samples is identified.
-
Publication number: US10417265B2
Publication date: 2019-09-17
Application number: US15351733
Application date: 2016-11-15
Applicant: SAP SE
Inventor: Udo Klein, Philipp Scholl
Abstract: A system, a method and a computer program product for indexing data samples are disclosed. A locality-sensitive string hash index is determined for each data sample in a plurality of data samples. The determined locality-sensitive string hash indexes for at least two data samples in the plurality of data samples are compared. The comparison includes estimating, based on the determined locality-sensitive string hash indexes, a distance between the two data samples. Based on the comparison, at least one data sample in the plurality of data samples being similar to at least another data sample in the plurality of data samples is identified.
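Illustration (not part of the patent record): the abstract describes three steps, namely computing a locality-sensitive string hash per data sample, estimating a distance between samples from the hashes alone, and flagging similar samples. The abstract does not say which hash function is used, so the sketch below substitutes SimHash over character trigrams with Hamming distance as the distance estimate. The names `shingles`, `simhash`, `hamming` and `similar_pairs`, the 64-bit width and the `max_distance` threshold are all assumptions made for this example.

```python
import hashlib
from itertools import combinations

BITS = 64  # fingerprint width; an assumption for this sketch


def shingles(text, n=3):
    """Return the set of lowercase character n-grams of text."""
    text = text.lower()
    if len(text) <= n:
        return {text}
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def simhash(text, n=3):
    """Compute a SimHash fingerprint (a stand-in locality-sensitive hash)."""
    counts = [0] * BITS
    for gram in shingles(text, n):
        # Hash each n-gram to 64 bits with a stable hash function.
        digest = hashlib.blake2b(gram.encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        # Each n-gram votes +1 or -1 on every bit position.
        for bit in range(BITS):
            counts[bit] += 1 if (h >> bit) & 1 else -1
    fingerprint = 0
    for bit in range(BITS):
        if counts[bit] > 0:
            fingerprint |= 1 << bit
    return fingerprint


def hamming(a, b):
    """Number of differing bits; used here as the estimated distance."""
    return bin(a ^ b).count("1")


def similar_pairs(samples, max_distance=20):
    """Return pairs of samples whose fingerprints differ in at most max_distance bits."""
    index = {s: simhash(s) for s in samples}  # hash each sample once
    return [
        (a, b)
        for a, b in combinations(samples, 2)
        if hamming(index[a], index[b]) <= max_distance
    ]
```

Calling `similar_pairs(list_of_strings)` compares only the fingerprints, never the raw strings, which is the point of the scheme. The threshold of 20 bits is arbitrary and would need tuning on real data. The all-pairs loop is quadratic in the number of samples, so a larger data set would need a bucketing step in front of it.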
-