Data hashing method, data processing method, and data processing system using similarity-based hashing algorithm
    1.
    Invention application
    Data hashing method, data processing method, and data processing system using similarity-based hashing algorithm (status: granted)

    Publication No.: US20070130188A1

    Publication date: 2007-06-07

    Application No.: US11634731

    Filing date: 2006-12-06

    IPC classification: G06F7/00

    Abstract: Provided are a data hashing method, a data processing method, and a data processing system using a similarity-based hashing (SBH) algorithm, in which identical data yield the same hash value and the more similar two pieces of data are, the smaller the difference between their hash values. The data hashing method includes receiving computerized data and generating a hash value of the computerized data using the SBH algorithm, under which two pieces of data are the same if their calculated hash values are the same and similar if the difference between their calculated hash values is small. Because the similarity or closeness of data content is quantified by that of the corresponding hash values, data can be searched, compared, and classified quickly, within a time complexity of O(1) or O(n).
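
The abstract does not disclose how the SBH algorithm computes its hash values, so the following Python sketch only illustrates the stated property with a toy stand-in. The names `toy_sbh` and `most_similar`, the 8-byte `width`, and the choice of "similar" as "differs only in late bytes" are all assumptions made for illustration, not the patented method. The toy hash reads a fixed-width prefix as a big-endian integer, so equal inputs hash equally and inputs that differ only near the end get numerically close values.

```python
def toy_sbh(data: bytes, width: int = 8) -> int:
    """Toy similarity-preserving hash (hypothetical, NOT the patented SBH).

    Reads the first `width` bytes, zero-padded on the right, as a
    big-endian integer. Inputs that differ only in late bytes therefore
    get numerically close hash values.
    """
    padded = data[:width].ljust(width, b"\x00")
    return int.from_bytes(padded, "big")


def most_similar(query: bytes, corpus: list[bytes]) -> bytes:
    """O(n) scan: return the corpus item whose hash is closest to the query's."""
    q = toy_sbh(query)
    return min(corpus, key=lambda item: abs(toy_sbh(item) - q))


corpus = [b"hashing", b"hashtag", b"zebra"]

# Exact-match search: a dictionary keyed by hash value, O(1) on average.
index = {toy_sbh(item): item for item in corpus}
exact = index.get(toy_sbh(b"hashing"))

# Similarity search: smallest hash difference, O(n) over the corpus.
nearest = most_similar(b"hashinh", corpus)

print(exact, nearest)
```

In this toy version, inputs that share their first 8 bytes collide, so "same hash value implies same data" holds only for inputs no longer than `width`. A real similarity-based hash would need a construction that covers the whole input.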
