Differentially private linear queries on histograms

    Publication No.: US09672364B2

    Publication date: 2017-06-06

    Application No.: US13831948

    Application date: 2013-03-15

    Abstract: The privacy of linear queries on histograms is protected. A database containing private data is queried. Base decomposition is performed to recursively compute an orthonormal basis for the database space. Using correlated (or Gaussian) noise and/or least squares estimation, an answer having differential privacy is generated and provided in response to the query. In some implementations, the differential privacy is ε-differential privacy (pure differential privacy) or is (ε,δ)-differential privacy (i.e., approximate differential privacy). In some implementations, the data in the database may be dense. Such implementations may use correlated noise without using least squares estimation. In other implementations, the data in the database may be sparse. Such implementations may use least squares estimation with or without using correlated noise.
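
As a rough illustration of answering linear queries on a histogram with (ε,δ)-differential privacy, the sketch below uses the plain textbook Gaussian mechanism. It is not the base decomposition or correlated-noise procedure the abstract describes. The function names (`l2_sensitivity`, `gaussian_mechanism`) are hypothetical, and it assumes that neighbouring databases differ by one count in a single histogram cell and that 0 < ε < 1.

```python
import math
import random


def l2_sensitivity(queries):
    """Largest L2 norm of any column of the query matrix.

    Assumption: changing one individual's data changes one histogram
    cell by 1, so the answers move by at most one column of the matrix.
    """
    n_cols = len(queries[0])
    return max(
        math.sqrt(sum(row[j] ** 2 for row in queries))
        for j in range(n_cols)
    )


def gaussian_mechanism(histogram, queries, epsilon, delta, rng=random):
    """Answer each linear query (a row of `queries`) with Gaussian noise.

    Uses the classic calibration sigma = S * sqrt(2 ln(1.25/delta)) / epsilon,
    which is only stated for 0 < epsilon < 1.
    """
    if not 0 < epsilon < 1 or not 0 < delta < 1:
        raise ValueError("this sketch assumes 0 < epsilon < 1 and 0 < delta < 1")
    sigma = l2_sensitivity(queries) * math.sqrt(2 * math.log(1.25 / delta)) / epsilon
    exact = [sum(q * x for q, x in zip(row, histogram)) for row in queries]
    return [answer + rng.gauss(0.0, sigma) for answer in exact]


# Example: two range-count queries over a three-bin histogram.
histogram = [10, 20, 30]
queries = [[1, 1, 0], [0, 1, 1]]
noisy = gaussian_mechanism(histogram, queries, epsilon=0.5, delta=1e-5,
                           rng=random.Random(0))
```

Independent noise of this kind is only a baseline. The abstract's point is that correlated noise (for dense data) or least squares post-processing (for sparse data) can give more accurate answers at the same privacy level.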

    SCALABLE, SCHEMALESS DOCUMENT QUERY MODEL
    2.
    Patent application
    Legal status: In force

    Publication No.: US20140280047A1

    Publication date: 2014-09-18

    Application No.: US13828229

    Application date: 2013-03-14

    IPC classification: G06F17/30

    Abstract: Query models for document sets (such as XML documents or records in a relational database) typically involve a schema defining the structure of the documents. However, rigidly defined schemas often raise difficulties with document validation in the presence of even inconsequential structural variations. Additionally, queries developed against schema-constrained documents are often sensitive to structural details and variations that are inconsequential to the query, resulting in inaccurate results and development complications, and such queries may break upon schema changes. Instead, query models for hierarchically structured documents can enable “twig” queries that specify only the structural details of document nodes that are relevant to the query (e.g., students in a student database having a sibling named “Lee” and a teacher named “Smith,” irrespective of unrelated structural details of the document). Such “twig” query models may enable more natural query development, and continued accuracy of queries in the event of unrelated schema variations and changes.
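
To make the "twig" idea concrete, here is a minimal sketch of matching a small tree pattern against a hierarchical document. The `Node` class, the `twig_query` function, and the sample data are all hypothetical and are not taken from the patent. The sketch also assumes that every edge in the pattern means "has a descendant", so structure between the matched nodes is ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A document node, also reused here as a pattern node."""
    label: str
    value: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def descendants(node):
    """Yield every node strictly below `node`, in pre-order."""
    for child in node.children:
        yield child
        yield from descendants(child)


def matches(node, pattern):
    """True if `node` satisfies `pattern`; unrelated structure is ignored."""
    if node.label != pattern.label:
        return False
    if pattern.value is not None and node.value != pattern.value:
        return False
    # Each pattern child must be satisfied by some descendant of `node`.
    return all(
        any(matches(d, sub) for d in descendants(node))
        for sub in pattern.children
    )


def twig_query(root, pattern):
    """Return all nodes in the document that match the twig pattern."""
    return [n for n in [root, *descendants(root)] if matches(n, pattern)]


school = Node("school", children=[
    Node("student", children=[
        Node("name", "Ann"),
        Node("family", children=[Node("sibling", children=[Node("name", "Lee")])]),
        Node("teacher", children=[Node("name", "Smith")]),
    ]),
    Node("student", children=[
        Node("name", "Bob"),
        Node("teacher", children=[Node("name", "Smith")]),
    ]),
])

# Students with a sibling named "Lee" and a teacher named "Smith".
pattern = Node("student", children=[
    Node("sibling", children=[Node("name", "Lee")]),
    Node("teacher", children=[Node("name", "Smith")]),
])

hits = twig_query(school, pattern)
print([h.children[0].value for h in hits])  # prints ['Ann']
```

The pattern never mentions the intermediate `family` node, so the query would keep working if that wrapper were renamed or removed.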

    Scalable, schemaless document query model
    3.
    Granted patent
    Legal status: In force

    Publication No.: US09230040B2

    Publication date: 2016-01-05

    Application No.: US13828229

    Application date: 2013-03-14

    IPC classification: G06F7/00 G06F17/30

    Abstract: Query models for document sets (such as XML documents or records in a relational database) typically involve a schema defining the structure of the documents. However, rigidly defined schemas often raise difficulties with document validation in the presence of even inconsequential structural variations. Additionally, queries developed against schema-constrained documents are often sensitive to structural details and variations that are inconsequential to the query, resulting in inaccurate results and development complications, and such queries may break upon schema changes. Instead, query models for hierarchically structured documents can enable “twig” queries that specify only the structural details of document nodes that are relevant to the query (e.g., students in a student database having a sibling named “Lee” and a teacher named “Smith,” irrespective of unrelated structural details of the document). Such “twig” query models may enable more natural query development, and continued accuracy of queries in the event of unrelated schema variations and changes.

    DIFFERENTIALLY PRIVATE LINEAR QUERIES ON HISTOGRAMS
    4.
    Patent application
    Legal status: In force

    Publication No.: US20140283091A1

    Publication date: 2014-09-18

    Application No.: US13831948

    Application date: 2013-03-15

    IPC classification: G06F21/60 G06F17/30

    Abstract: The privacy of linear queries on histograms is protected. A database containing private data is queried. Base decomposition is performed to recursively compute an orthonormal basis for the database space. Using correlated (or Gaussian) noise and/or least squares estimation, an answer having differential privacy is generated and provided in response to the query. In some implementations, the differential privacy is ε-differential privacy (pure differential privacy) or is (ε,δ)-differential privacy (i.e., approximate differential privacy). In some implementations, the data in the database may be dense. Such implementations may use correlated noise without using least squares estimation. In other implementations, the data in the database may be sparse. Such implementations may use least squares estimation with or without using correlated noise.

    Query and index over documents
    5.
    Granted patent
    Legal status: In force

    Publication No.: US09208254B2

    Publication date: 2015-12-08

    Application No.: US13709064

    Application date: 2012-12-10

    IPC classification: G06F17/30 G06F17/22

    Abstract: A document index is generated from a set of documents and is used to identify documents that match one or more queries. A tree is generated for each document, with a node corresponding to each object of the document. The nodes of the generated trees are merged or combined to generate the document index, which is itself a tree. In addition, an inverted index is generated for each node of the document index, identifying the tree(s) that the node originated from. When a query is received, the query is first executed against the document index tree; during the execution, appropriate set operations are applied to the inverted indices associated with the nodes matched by the query. The resulting set identifies the documents that may match the query. The query is then executed on the identified documents.
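
The two-phase lookup described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: documents are JSON-like nested dicts and lists, a query is a nested dict of keys that must be present, and each document is merged straight into the index rather than first being turned into its own tree. All names (`new_node`, `add_document`, `candidates`, `matches`, `search`) are hypothetical.

```python
def new_node():
    """An index node: the ids of documents containing it, plus child nodes."""
    return {"docs": set(), "children": {}}


def add_document(index, doc_id, obj):
    """Merge one document into the index tree, recording doc_id at each node."""
    index["docs"].add(doc_id)
    if isinstance(obj, dict):
        for key, value in obj.items():
            child = index["children"].setdefault(key, new_node())
            add_document(child, doc_id, value)
    elif isinstance(obj, list):
        for item in obj:  # list items are merged into the same index node
            add_document(index, doc_id, item)


def candidates(index, query):
    """Phase 1: intersect inverted indices of the index nodes the query touches."""
    result = set(index["docs"])
    for key, sub in query.items():
        child = index["children"].get(key)
        if child is None:
            return set()
        result &= candidates(child, sub)
    return result


def matches(obj, query):
    """Phase 2: exact check of the query against a single document."""
    if not query:
        return True
    if isinstance(obj, list):
        return any(matches(item, query) for item in obj)
    if not isinstance(obj, dict):
        return False
    return all(k in obj and matches(obj[k], sub) for k, sub in query.items())


def search(index, documents, query):
    return sorted(d for d in candidates(index, query)
                  if matches(documents[d], query))


documents = {
    1: {"student": [{"sibling": "Lee"}, {"teacher": "Smith"}]},
    2: {"student": [{"sibling": "Lee", "teacher": "Smith"}]},
    3: {"course": {"title": "Math"}},
}
index = new_node()
for doc_id, doc in documents.items():
    add_document(index, doc_id, doc)

# A single student that has both a sibling and a teacher.
query = {"student": {"sibling": {}, "teacher": {}}}
print(sorted(candidates(index, query)))  # prints [1, 2]
print(search(index, documents, query))   # prints [2]
```

Document 1 survives the first phase because merging loses the information about which student owns which field. That is why the index only identifies documents that may match, and why the query is run again on those candidates.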

    QUERY AND INDEX OVER DOCUMENTS
    6.
    Patent application
    Legal status: In force

    Publication No.: US20140164388A1

    Publication date: 2014-06-12

    Application No.: US13709064

    Application date: 2012-12-10

    IPC classification: G06F17/30

    Abstract: A document index is generated from a set of documents and is used to identify documents that match one or more queries. A tree is generated for each document, with a node corresponding to each object of the document. The nodes of the generated trees are merged or combined to generate the document index, which is itself a tree. In addition, an inverted index is generated for each node of the document index, identifying the tree(s) that the node originated from. When a query is received, the query is first executed against the document index tree; during the execution, appropriate set operations are applied to the inverted indices associated with the nodes matched by the query. The resulting set identifies the documents that may match the query. The query is then executed on the identified documents.
