AUTOMATED DATABASE SCHEMA ANNOTATION
    7.
    Invention publication
    Status: Under examination (published)

    Publication No.: EP3311305A1

    Publication date: 2018-04-25

    Application No.: EP16729469.3

    Filing date: 2016-06-06

    IPC classification: G06F17/30, G06F17/24

    Abstract: Techniques and constructs improve the annotation of target columns of a target database by performing automated annotation of the target columns using sources. The techniques include calculating a similarity score between a target column and columns extracted from a table that is included in a source. The similarity score is calculated based at least in part on a similarity between a value in the target column of the target database and a column value of the extracted column from the table, and on a similarity between an identity of the target column of the target database and column identities of the extracted columns from the table. In some examples, the techniques calculate similarity scores for one or more extracted columns and annotate the target column based on the similarity scores.
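    The scoring idea in the abstract above can be illustrated with a minimal sketch. It is not the method claimed in the patent: the abstract names no specific measures, so the Jaccard overlap for values, the `difflib` ratio for column names, the 0.7/0.3 weighting, the threshold, and every function name below are assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher


def value_similarity(target_values, source_values):
    """Jaccard overlap of the distinct, case-folded values (assumed measure)."""
    a = {str(v).strip().lower() for v in target_values}
    b = {str(v).strip().lower() for v in source_values}
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def name_similarity(target_name, source_name):
    """String similarity of the column identities (assumed measure)."""
    return SequenceMatcher(None, target_name.lower(), source_name.lower()).ratio()


def similarity_score(target, source, value_weight=0.7):
    """Weighted mix of value and name similarity; the weight is hypothetical."""
    v = value_similarity(target["values"], source["values"])
    n = name_similarity(target["name"], source["name"])
    return value_weight * v + (1 - value_weight) * n


def annotate(target, extracted_columns, threshold=0.3):
    """Return the name of the best-scoring extracted column, or None."""
    best = max(extracted_columns,
               key=lambda c: similarity_score(target, c),
               default=None)
    if best is None or similarity_score(target, best) < threshold:
        return None
    return best["name"]


# Hypothetical data: a cryptically named target column and two source columns.
target = {"name": "ctry_cd", "values": ["US", "DE", "FR"]}
extracted = [
    {"name": "Country code", "values": ["US", "DE", "FR", "JP"]},
    {"name": "City", "values": ["Paris", "Berlin"]},
]
label = annotate(target, extracted)
```

    In this sketch the target column `ctry_cd` is annotated with `Country code`, because the two columns share most of their values even though their names differ.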

    JOINING SEMANTICALLY-RELATED DATA USING BIG TABLE CORPORA
    10.
    Invention publication
    Status: Granted

    Publication No.: EP3304347A1

    Publication date: 2018-04-11

    Application No.: EP16728148.4

    Filing date: 2016-05-18

    IPC classification: G06F17/30

    Abstract: Examples of the disclosure enable performing semantic joins using a big table corpus. Pairs of values from at least two data sets are identified. The pairs of values include one value from a first one of the data sets and one value from a second one of the data sets. Statistical co-occurrence scores for the identified pairs of values are determined based on historical co-occurrence data. The determined statistical co-occurrence scores are used for predicting a semantic relationship between the at least two data sets. The predicted semantic relationship is used for joining the at least two data sets.
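    A minimal sketch of the co-occurrence step described above follows. It is an illustration under stated assumptions, not the patented procedure: the abstract does not say which statistic is used, so pointwise mutual information (PMI) stands in for the "statistical co-occurrence score", the corpus is modeled as a list of rows, and all names (`build_counts`, `pmi`, `semantic_join`) are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(corpus_rows):
    """Count how often each value, and each unordered value pair, shares a row."""
    single, pair = Counter(), Counter()
    for row in corpus_rows:
        values = set(row)
        single.update(values)
        pair.update(frozenset(p) for p in combinations(values, 2))
    return single, pair, len(corpus_rows)


def pmi(u, v, single, pair, n):
    """Pointwise mutual information of u and v (assumed scoring function)."""
    joint = pair[frozenset((u, v))]
    if joint == 0:
        return float("-inf")  # the pair was never observed together
    return math.log((joint * n) / (single[u] * single[v]))


def semantic_join(left, right, corpus_rows, min_score=0.0):
    """Pair each left value with its best-scoring right value, if any."""
    single, pair, n = build_counts(corpus_rows)
    joined = []
    for u in left:
        best = max(right, key=lambda v: pmi(u, v, single, pair, n), default=None)
        if best is not None and pmi(u, best, single, pair, n) > min_score:
            joined.append((u, best))
    return joined


# Hypothetical historical co-occurrence data: rows seen in a table corpus.
corpus = [
    ("Washington", "WA"), ("Washington", "WA"),
    ("Oregon", "OR"), ("Oregon", "OR"),
    ("Washington", "Seattle"), ("Texas", "TX"),
]
single, pair, n = build_counts(corpus)
result = semantic_join(["Washington", "Oregon"], ["OR", "WA"], corpus)
```

    Here state names and abbreviations share no characters in a way that an equality join could use, yet the sketch pairs `Washington` with `WA` and `Oregon` with `OR` because those values co-occur in the corpus more often than chance would predict.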