Patent search: ap:("Mihaela Ancuta Bornea" OR "Songyun Duan" OR "Achille Belly Fokoue-Nkoutche" OR "Oktie Hassanzadeh" OR "Anastasios Kementsietsidis" OR "Kavitha Srinivas" OR "Michael James Ward") AND inv:"Mihaela Ancuta Bornea"

1.

Granted invention patent
Linking data elements based on similarity data values and semantic annotations (status: in force)

Publication No.: US10229200B2

Publication date: 2019-03-12

Application No.: US13491724

Filing date: 2012-06-08

Applicants: Mihaela Ancuta Bornea, Songyun Duan, Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael James Ward

Inventors: Mihaela Ancuta Bornea, Songyun Duan, Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael James Ward

IPC classification: G06F17/30

Abstract: Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of equivalent fixed size such that similarities among the data values in the data value sets across all data elements is maintained among the plurality of instance signatures. Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. Candidate pairs of data elements having a similarity index above a given threshold are linked.
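The fixed-size, similarity-preserving signature described in this abstract can be illustrated with MinHash. This is a minimal sketch under assumptions: the abstract does not name MinHash, and the function names, the choice of BLAKE2b as the hash family, and the signature length of 64 are all hypothetical choices made for illustration.

```python
import hashlib
from typing import Iterable, List

NUM_HASHES = 64  # signature length; an arbitrary choice for this sketch


def _hash(value: str, seed: int) -> int:
    """Deterministic 64-bit hash of a value; each seed acts as a different hash function."""
    digest = hashlib.blake2b(
        value.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(values: Iterable[str], num_hashes: int = NUM_HASHES) -> List[int]:
    """Reduce a value set of any size to num_hashes integers (one minimum per hash function)."""
    distinct = set(values)
    if not distinct:
        raise ValueError("value set must be non-empty")
    return [min(_hash(v, seed) for v in distinct) for seed in range(num_hashes)]


def estimated_similarity(sig_a: List[int], sig_b: List[int]) -> float:
    """Fraction of matching positions, which estimates the Jaccard similarity of the two sets."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(1 for a, b in zip(sig_a, sig_b) if a == b) / len(sig_a)


# Two hypothetical data elements (for example, columns) with overlapping value sets.
cities_a = minhash_signature({"Paris", "Rome", "Berlin", "Madrid"})
cities_b = minhash_signature({"Paris", "Rome", "Berlin", "Lisbon"})
print(len(cities_a), len(cities_b))          # both signatures have NUM_HASHES entries
print(estimated_similarity(cities_a, cities_b))
```

Every signature has the same length regardless of how many values the element holds, and elements with heavily overlapping value sets tend to agree in many signature positions.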

2.

Invention application
Linking Data Elements Based on Similarity Data Values and Semantic Annotations (status: under examination, published)

Publication No.: US20130332467A1

Publication date: 2013-12-12

Application No.: US13543872

Filing date: 2012-07-08

Applicants: Mihaela Ancuta Bornea, Songyun Duan, Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward

Inventors: Mihaela Ancuta Bornea, Songyun Duan, Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward

IPC classification: G06F17/30

CPC classification: G06F16/951

Abstract: Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of equivalent fixed size such that similarities among the data values in the data value sets across all data elements is maintained among the plurality of instance signatures. Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. Candidate pairs of data elements having a similarity index above a given threshold are linked.
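The candidate-pair step in this abstract can be illustrated with the common banding form of locality sensitive hashing. This is a sketch under assumptions, not the claimed method: it takes precomputed fixed-size signatures as input, and the function name `candidate_pairs` and the `bands`/`rows` parameters are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple


def candidate_pairs(
    signatures: Dict[str, List[int]], bands: int, rows: int
) -> Set[Tuple[str, str]]:
    """Return pairs of element names whose signatures agree on all rows of at least one band."""
    buckets = defaultdict(set)
    for name, sig in signatures.items():
        if len(sig) != bands * rows:
            raise ValueError("signature length must equal bands * rows")
        for b in range(bands):
            # Elements land in the same bucket only if this whole band is identical.
            band = tuple(sig[b * rows:(b + 1) * rows])
            buckets[(b, band)].add(name)

    pairs = set()
    for members in buckets.values():
        for first, second in combinations(sorted(members), 2):
            pairs.add((first, second))
    return pairs


# Toy signatures of length 4, split into 2 bands of 2 rows each.
toy = {
    "customer.city": [1, 2, 3, 4],
    "supplier.town": [1, 2, 9, 9],   # shares band 0 with customer.city
    "order.total":   [7, 7, 7, 7],   # shares no band with the others
}
print(candidate_pairs(toy, bands=2, rows=2))
```

Only the pairs that collide in some bucket need the more expensive similarity computation, which avoids comparing every element against every other one.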

3.

Invention application
Linking Data Elements Based on Similarity Data Values and Semantic Annotations (status: under examination, published)

Publication No.: US20130332466A1

Publication date: 2013-12-12

Application No.: US13491724

Filing date: 2012-06-08

Applicants: Mihaela Ancuta Bornea, Songyun Duan, Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward

Inventors: Mihaela Ancuta Bornea, Songyun Duan, Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward

IPC classification: G06F17/30

CPC classification: G06F17/30864

Abstract: Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of equivalent fixed size such that similarities among the data values in the data value sets across all data elements is maintained among the plurality of instance signatures. Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. Candidate pairs of data elements having a similarity index above a given threshold are linked.


4.

Granted invention patent
Querying and integrating structured and unstructured data (status: in force)

Publication No.: US09037615B2

Publication date: 2015-05-19

Application No.: US13493174

Filing date: 2012-06-11

Applicants: Mihaela Ancuta Bornea, Songyun Duan, James J. Fan, Achille Fokoue-Nkoutche, Alfio M. Gliozzo, Aditya Kalyanpur, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward

Inventors: Mihaela Ancuta Bornea, Songyun Duan, James J. Fan, Achille Fokoue-Nkoutche, Alfio M. Gliozzo, Aditya Kalyanpur, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward

IPC classification: G06F7/00, G06F17/30

CPC classification: G06F17/30946, G06F17/30292

Abstract: A computer-implemented method, system, and article of manufacture for querying and integrating structured and unstructured data. The method includes: receiving entity information that is extracted from a first set of unstructured data using an open domain information extraction system, wherein the entity information comprises relationship information between a first entity and a second entity of the first set of unstructured data; recognizing a pattern based on the relationship information and creating a schema for the first set of unstructured data based on the pattern; and associating an element of the created schema with (i) an entity of a second set of unstructured data or (ii) a schema element of an existing set of structured data if there is sufficient overall similarity between the created schema element and either the second unstructured data entity or the schema element of the existing structured data.
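The flow in this abstract (extracted relationships, then a schema, then similarity-based association) can be made concrete with a toy sketch. Everything here is an assumption for illustration: the abstract does not say how patterns are recognized or how "overall similarity" is measured, so this sketch uses relation frequency as the pattern and token-level Jaccard similarity of names as a stand-in measure. All function names and sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (first entity, relation, second entity)


def induce_schema(triples: Iterable[Triple], min_support: int = 2) -> List[str]:
    """Treat relations seen at least min_support times as schema elements."""
    counts = Counter(relation for _first, relation, _second in triples)
    return sorted(r for r, n in counts.items() if n >= min_support)


def token_jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the underscore-separated tokens of two names."""
    tokens_a, tokens_b = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def associate(
    schema_elements: Iterable[str], existing_columns: List[str], threshold: float = 0.5
) -> Dict[str, str]:
    """Link each induced element to its most similar existing column if similar enough."""
    links = {}
    for element in schema_elements:
        best = max(existing_columns, key=lambda c: token_jaccard(element, c), default=None)
        if best is not None and token_jaccard(element, best) >= threshold:
            links[element] = best
    return links


# Hypothetical output of an information extraction step over free text.
extracted = [
    ("Alice", "works_for", "IBM"),
    ("Bob", "works_for", "Acme"),
    ("Alice", "born_in", "Paris"),
]
schema = induce_schema(extracted)
print(schema)
print(associate(schema, ["works_for_company", "birth_city"]))
```

A real system would likely combine several signals (names, value overlap, types) into the similarity score; the single name-based measure here only shows where the threshold test sits in the flow.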


5.

Invention application
QUERYING AND INTEGRATING STRUCTURED AND UNSTRUCTURED DATA (status: in force)

Publication No.: US20130332478A1

Publication date: 2013-12-12

Application No.: US13493174

Filing date: 2012-06-11

Applicants: Mihaela Ancuta Bornea, Songyun Duan, James J. Fan, Achille Fokoue-Nkoutche, Alfio M. Gliozzo, Aditya Kalyanpur, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward

Inventors: Mihaela Ancuta Bornea, Songyun Duan, James J. Fan, Achille Fokoue-Nkoutche, Alfio M. Gliozzo, Aditya Kalyanpur, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward

IPC classification: G06F17/30

CPC classification: G06F17/30946, G06F17/30292

Abstract: A computer-implemented method, system, and article of manufacture for querying and integrating structured and unstructured data. The method includes: receiving entity information that is extracted from a first set of unstructured data using an open domain information extraction system, wherein the entity information comprises relationship information between a first entity and a second entity of the first set of unstructured data; recognizing a pattern based on the relationship information and creating a schema for the first set of unstructured data based on the pattern; and associating an element of the created schema with (i) an entity of a second set of unstructured data or (ii) a schema element of an existing set of structured data if there is sufficient overall similarity between the created schema element and either the second unstructured data entity or the schema element of the existing structured data.


6.

Granted invention patent
Optimizing sparse schema-less data in relational stores (status: in force)

Publication No.: US08918434B2

Publication date: 2014-12-23

Application No.: US13454559

Filing date: 2012-04-24

Applicants: Bishwaranjan Bhattacharjee, Mihaela Ancuta Bornea, Patrick Dantressangle, Julian Dolby, Kavitha Srinivas, Octavian Udrea

Inventors: Bishwaranjan Bhattacharjee, Mihaela Ancuta Bornea, Patrick Dantressangle, Julian Dolby, Kavitha Srinivas, Octavian Udrea

IPC classification: G06F17/30, G06F7/00

CPC classification: G06F17/30292

Abstract: Various embodiments of the invention relate to optimizing storage of schema-less data. A schema-less dataset including a plurality of resources is received. Each resource is associated with at least a plurality of properties. At least one set of co-occurring properties from the plurality of properties is identified. A graph including a plurality of nodes is generated. Each of the nodes represents a unique property in the set of co-occurring properties. The graph further includes an edge connecting each node representing a pair of co-occurring properties. A graph coloring operation is performed on the graph. The graph coloring operation includes assigning each of nodes to a color, where nodes connected by an edge are assigned different colors. A schema is generated that assigns a column identifier from a table to each unique property represented by one of the nodes in the graph based on the color assigned to the node.
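The property-to-column assignment in this abstract can be sketched with a simple greedy graph coloring. This is an illustration under assumptions: the abstract does not specify which coloring algorithm is used, so a highest-degree-first greedy heuristic stands in for it, and the function names, the `colN` column naming, and the sample dataset are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, Set


def build_cooccurrence_graph(resources: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """One node per property; an edge joins two properties that appear on the same resource."""
    graph: Dict[str, Set[str]] = {}
    for props in resources.values():
        unique = sorted(set(props))
        for p in unique:
            graph.setdefault(p, set())
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def greedy_coloring(graph: Dict[str, Set[str]]) -> Dict[str, int]:
    """Give each node the smallest color not used by an already colored neighbor."""
    colors: Dict[str, int] = {}
    # Visit high-degree nodes first; ties are broken by name so the result is deterministic.
    for node in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        used = {colors[nb] for nb in graph[node] if nb in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


def schema_from_colors(colors: Dict[str, int]) -> Dict[str, str]:
    """Map each property to a column identifier derived from its color."""
    return {prop: f"col{color}" for prop, color in colors.items()}


# A hypothetical sparse dataset: each resource uses only a few properties.
dataset = {
    "person1": ["name", "email"],
    "person2": ["name", "phone"],
    "book1": ["title", "isbn"],
}
graph = build_cooccurrence_graph(dataset)
print(schema_from_colors(greedy_coloring(graph)))
```

Properties that never co-occur can share a color and therefore a column, so the five properties in this example fit into two columns without any resource needing two values in the same column.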

