DISCOVERY OF LINKAGE POINTS BETWEEN DATA SOURCES

    Publication No.: US20200183995A1

    Publication date: 2020-06-11

    Application No.: US16794895

    Filing date: 2020-02-19

    Abstract: Data records are linked across a plurality of datasets. Each dataset contains at least one data record, and each data record is associated with an entity and includes one or more attributes of that entity and a value for each attribute. Values associated with attributes are compared across datasets, and matching attributes having values that satisfy a predetermined similarity threshold are identified. In addition, linkage points between pairs of datasets are identified. Each linkage point links one or more pairs of data records. Each data record in the pair of data records is contained in one of a given pair of datasets, and each pair of data records is associated with a common entity having matching attributes in the given pair of datasets. Data records associated with the common entities are linked across datasets using the identified linkage points.
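The comparison described in this abstract can be illustrated with a small sketch. It is not the patented method: it assumes datasets are lists of dictionaries, uses Jaccard similarity over normalized value sets as the similarity measure, and uses a threshold of 0.5. The function names and the sample data are hypothetical.

```python
from itertools import product


def normalize(value):
    """Lower-case, trimmed string form of a value (an assumed normalization)."""
    return str(value).strip().lower()


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def attribute_values(dataset, attr):
    """Set of normalized values that `attr` takes across a dataset's records."""
    return {normalize(r[attr]) for r in dataset if attr in r}


def find_linkage_points(ds_a, ds_b, threshold=0.5):
    """Return (attr_a, attr_b, score) for attribute pairs with similar value sets."""
    attrs_a = sorted({k for r in ds_a for k in r})
    attrs_b = sorted({k for r in ds_b for k in r})
    points = []
    for a, b in product(attrs_a, attrs_b):
        score = jaccard(attribute_values(ds_a, a), attribute_values(ds_b, b))
        if score >= threshold:
            points.append((a, b, score))
    return points


def link_records(ds_a, ds_b, points):
    """Pair up record indices that agree on at least one linkage point."""
    links = set()
    for a, b, _ in points:
        for i, ra in enumerate(ds_a):
            for j, rb in enumerate(ds_b):
                if a in ra and b in rb and normalize(ra[a]) == normalize(rb[b]):
                    links.add((i, j))
    return sorted(links)


# Hypothetical sample data: two datasets describing the same companies.
companies = [
    {"name": "Acme Corp", "ticker": "ACM"},
    {"name": "Globex", "ticker": "GLX"},
]
filings = [
    {"symbol": "acm", "year": 2019},
    {"symbol": "GLX", "year": 2020},
    {"symbol": "INI", "year": 2020},
]

points = find_linkage_points(companies, filings)
links = link_records(companies, filings, points)
print(points)
print(links)
```

Here the `ticker` and `symbol` attributes share two of three distinct values, so they are the only pair that clears the threshold, and records are then linked wherever those two attributes agree.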

    Linkage Prediction Through Similarity Analysis

    Publication No.: US20180053096A1

    Publication date: 2018-02-22

    Application No.: US15242821

    Filing date: 2016-08-22

    IPC classification: G06N5/02 G06F17/30

    CPC classification: G06N5/022 G06F16/9024

    Abstract: Methods, systems, and computer program products for linkage prediction through similarity analysis are provided herein. A computer-implemented method includes extracting multiple features from (i) one or more attributes of a set of source nodes within a knowledge graph and (ii) one or more attributes of a set of target nodes within the knowledge graph, wherein at least one extracted feature satisfies a designated complexity level; performing a similarity analysis across the at least one extracted feature by applying one or more similarity measures to the at least one extracted feature; predicting one or more sets of links between the source nodes and the target nodes based on the similarity analysis, wherein one or more sets of predicted links satisfy a pre-determined accuracy threshold; and outputting the one or more sets of predicted links to a user.
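The extract, compare, and predict steps in this abstract might look roughly like the following sketch. It rests on several assumptions that are not in the source: nodes are dictionaries with an `id` and an `attributes` mapping, features are bag-of-words counts, cosine similarity is the single similarity measure, and a minimum score of 0.5 stands in for the accuracy threshold. All names and sample nodes are hypothetical.

```python
import math
from collections import Counter


def features(node):
    """Bag-of-words feature vector built from all attribute values of a node."""
    tokens = []
    for value in node["attributes"].values():
        tokens.extend(str(value).lower().split())
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_links(sources, targets, min_score=0.5):
    """Return (source_id, target_id, score) for pairs scoring at least min_score."""
    predicted = []
    for s in sources:
        fs = features(s)
        for t in targets:
            score = cosine(fs, features(t))
            if score >= min_score:
                predicted.append((s["id"], t["id"], score))
    # Highest-scoring predictions first.
    return sorted(predicted, key=lambda link: -link[2])


# Hypothetical knowledge-graph nodes.
sources = [
    {"id": "drug:aspirin",
     "attributes": {"class": "nsaid analgesic", "target": "cox1 cox2"}},
]
targets = [
    {"id": "drug:ibuprofen",
     "attributes": {"class": "nsaid analgesic", "target": "cox2"}},
    {"id": "drug:metformin",
     "attributes": {"class": "biguanide", "target": "ampk"}},
]

links = predict_links(sources, targets)
for source_id, target_id, score in links:
    print(source_id, "->", target_id, round(score, 3))
```

Only the aspirin and ibuprofen pair shares any tokens, so it is the only predicted link; a real system would combine several measures and validate the threshold against known links.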

    PREDICTION OF ADVERSE DRUG EVENTS
    25.
    Patent application

    Publication No.: US20170116390A1

    Publication date: 2017-04-27

    Application No.: US14953590

    Filing date: 2015-11-30

    IPC classification: G06F19/00

    CPC classification: G06F19/326 G06F19/3456

    Abstract: Embodiments include methods, systems, and computer program products for predicting adverse drug events on a computational system. Aspects include receiving known drug data from drug databases and one or more of a candidate drug, a drug pair, and a candidate drug-patient pair. Aspects also include calculating an adverse event prediction rating representing a confidence level of an adverse drug event for the candidate drug, drug pair, or candidate drug-patient pair, the rating being based on the known drug data. Aspects also include associating adverse event features with the candidate drug, drug pair, or candidate drug-patient pair, including a nature, cause, mechanism, or severity of the adverse drug event. Aspects also include calculating and outputting an adverse event prediction rating.

    Uniform search, navigation and combination of heterogeneous data
    26.
    Granted patent (status: in force)

    Publication No.: US09569506B2

    Publication date: 2017-02-14

    Application No.: US14960696

    Filing date: 2015-12-07

    IPC classification: G06F17/30

    Abstract: A unified interface that abstracts the underlying differences among heterogeneous data sources and data formats to produce uniform search results. While the result of an initial search may be exactly what the user was seeking, it is likely that the result is only in the neighborhood of what was sought. The end user may be aided by guided data navigation suggestions that locate related data during data exploration, by analysis that identifies data similarities among disparate data sources, and by guided combination options. The guided data navigation suggestions may include suggestions based on schematic, semantic, and social information. Guided data navigation may aid the user in moving from the initial search landing point in the data to the precise result sought.


    Producing Clustered Top-K Plans
    27.
    Patent application (status: in force)

    Publication No.: US20160321544A1

    Publication date: 2016-11-03

    Application No.: US14525790

    Filing date: 2014-10-28

    IPC classification: G06N5/02

    Abstract: A mechanism is provided for identifying a set of top-m clusters from a set of top-k plans. A planning problem and an integer value k indicating a number of top plans to be identified are received. A set of top-k plans of size at most k is generated, where the set of top-k plans is with respect to a given measure of plan quality. Each plan in the set of top-k plans is clustered based on a similarity between plans such that each cluster contains similar plans and each plan is grouped into only one cluster, thereby forming the set of top-m clusters. A representative plan from each top-m cluster is presented to the user.
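The clustering step in this abstract can be sketched as follows. This is an illustration under stated assumptions rather than the claimed mechanism: candidate plans are taken as already generated (no planner is involved), plan quality is a numeric `cost` where lower is better, similarity is Jaccard overlap of action sets, and clustering is a simple greedy pass with an assumed threshold of 0.6. The plan names and actions are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_top_k(plans, k, min_similarity=0.6):
    """Keep the k cheapest plans and greedily group them into disjoint clusters.

    Plans are visited in order of increasing cost, so the first plan in each
    cluster is its cheapest member and serves as the representative.
    """
    top_k = sorted(plans, key=lambda p: p["cost"])[:k]
    clusters = []
    for plan in top_k:
        actions = set(plan["actions"])
        for cluster in clusters:
            representative = set(cluster[0]["actions"])
            if jaccard(actions, representative) >= min_similarity:
                cluster.append(plan)  # each plan joins exactly one cluster
                break
        else:
            clusters.append([plan])  # no similar cluster found: start a new one
    return clusters


# Hypothetical candidate plans for a delivery problem.
plans = [
    {"name": "p1", "actions": ["load", "drive", "unload"], "cost": 3},
    {"name": "p2", "actions": ["load", "drive", "refuel", "unload"], "cost": 4},
    {"name": "p3", "actions": ["load", "fly", "unload"], "cost": 5},
    {"name": "p4", "actions": ["load", "fly", "refuel", "unload"], "cost": 6},
    {"name": "p5", "actions": ["walk"], "cost": 9},
]

clusters = cluster_top_k(plans, k=4)
representatives = [cluster[0]["name"] for cluster in clusters]
print(representatives)
```

With k = 4 the walking plan is dropped, the two driving plans form one cluster and the two flying plans another, so the user would see one representative per cluster instead of four near-duplicate plans.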


    Producing Clustered Top-K Plans
    28.
    Patent application (status: in force)

    Publication No.: US20160117602A1

    Publication date: 2016-04-28

    Application No.: US14745899

    Filing date: 2015-06-22

    IPC classification: G06N99/00 G06F17/30

    Abstract: A mechanism is provided for identifying a set of top-m clusters from a set of top-k plans. A planning problem and an integer value k indicating a number of top plans to be identified are received. A set of top-k plans of size at most k is generated, where the set of top-k plans is with respect to a given measure of plan quality. Each plan in the set of top-k plans is clustered based on a similarity between plans such that each cluster contains similar plans and each plan is grouped into only one cluster, thereby forming the set of top-m clusters. A representative plan from each top-m cluster is presented to the user.


    Uniform search, navigation and combination of heterogeneous data
    29.
    Granted patent (status: in force)

    Publication No.: US09244991B2

    Publication date: 2016-01-26

    Application No.: US13968486

    Filing date: 2013-08-16

    IPC classification: G06F17/30

    Abstract: A method and system for interfacing with an end user to search, navigate, and combine large numbers of heterogeneous data sources with varying data characteristics. Search terms entered by the end user are received, and the end user is then presented with a guided exploration including search results and search result details. The end user is also presented with a guided combination including search result combination options and combination details. Both the guided exploration and the guided combination render all data from the heterogeneous data sources in a uniform data format, and both can culminate in saving selected results.
