System, apparatus, and method for user tunable and selectable searching of a database using a weighted quantized feature vector
    Granted invention patent
    Status: Expired

    Publication (announcement) number: US07251643B2

    Publication (announcement) date: 2007-07-31

    Application number: US10516061

    Application date: 2004-05-25

    IPC classification: G06F17/30

    Abstract: A data processing means for user tunable and selectable searching (FIG. 2) of a database, wherein the data contained therein have associated descriptive properties (FIG. 2) capable of being expressed in numeric form, is described. Descriptive property values (FIG. 2) may be standardized numerically to eliminate property value overweighting. A quantized vector (FIG. 2) representative of the descriptive properties is created for each item in the database. This quantized vector becomes the fingerprint for each data item. The user submits a query item to be matched against the database for similarity. A fingerprint is calculated for the query item. The user may then assign weights to the individual descriptive properties based upon perceived importance (FIG. 2). A newly weighted fingerprint for the query item is then compared with the fingerprints for all the data in the database. A list of results is presented to the user (FIG. 2). The user may then change the previously assigned weights and re-run the similarity search. This may be done as often as necessary to achieve the desired results. Similarity searching in a generic database is described. However, the method is particularly desirable in databases containing chemical compound structure data or biological response screening result data.
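The search loop summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not specify how values are standardized, how they are quantized, or how fingerprints are compared, so the z-score standardization, the equal-width binning (`n_bins`, `clip`), and the weighted L1 distance used here are all assumptions, as are the function names.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each property column so no property dominates by scale (assumed method)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # fall back to 1.0 for constant columns
    z_rows = [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]
    return z_rows, mus, sds


def quantize(z, n_bins=8, clip=3.0):
    """Map a standardized value to an integer bin in [0, n_bins - 1] (assumed binning)."""
    z = max(-clip, min(clip, z))
    return min(n_bins - 1, int((z + clip) / (2 * clip) * n_bins))


def fingerprint(z_row, n_bins=8):
    """Quantized feature vector for one item."""
    return [quantize(z, n_bins) for z in z_row]


def weighted_distance(fp_a, fp_b, weights):
    """Weighted L1 distance between two fingerprints (assumed similarity measure)."""
    return sum(w * abs(a - b) for a, b, w in zip(fp_a, fp_b, weights))


def search(query, database, weights, n_bins=8):
    """Return (distance, row index) pairs, most similar first."""
    z_rows, mus, sds = standardize(database)
    fps = [fingerprint(r, n_bins) for r in z_rows]
    # The query is standardized with the database statistics before quantizing.
    q_fp = fingerprint([(v - m) / s for v, m, s in zip(query, mus, sds)], n_bins)
    return sorted((weighted_distance(q_fp, fp, weights), i) for i, fp in enumerate(fps))
```

Calling `search` again with a different `weights` list corresponds to the user retuning the property weights and re-running the similarity search. The database fingerprints do not depend on the weights, so a real system could compute them once and reuse them across runs.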
