-
1.
Publication No.: US20230359615A1
Publication date: 2023-11-09
Application No.: US17740077
Application date: 2022-05-09
Applicant: data.world, Inc.
Inventor: Shad William Reynolds, David Lee Griffith, Bryon Kristen Jacob
IPC: G06F16/242, G06F16/25, G06F16/2455, G06F16/248
CPC classification number: G06F16/2445, G06F16/24553, G06F16/248, G06F16/252
Abstract: Various embodiments relate generally to data science and data analysis, computer software and systems, and network communications to interface among repositories of disparate datasets and computing machine-based entities configured to access datasets, and, more specifically, to a computing and data storage platform configured to provide one or more computerized tools to deploy predictive data models based on in-situ auxiliary query commands implemented in a query, and configured to facilitate development and management of data projects by providing an interactive, project-centric workspace interface coupled to collaborative computing devices and user accounts. For example, a method may include activating a query engine, implementing a subset of auxiliary instructions, at least one auxiliary instruction being configured to access model data, receiving a query that causes the query engine to access the model data, receiving serialized model data, performing a function associated with the serialized model data, and generating resultant data.
-
Publication No.: US11669540B2
Publication date: 2023-06-06
Application No.: US17004570
Application date: 2020-08-27
Applicant: data.world, Inc.
Inventor: David Lee Griffith
IPC: G06F7/00, G06F16/25, G06F16/901, G06F16/22
CPC classification number: G06F16/258, G06F16/221, G06F16/9024
Abstract: Various embodiments relate generally to data science and data analysis, computer software and systems, and wired and wireless network communications to interface among repositories of disparate datasets and computing machine-based entities configured to access datasets, and, more specifically, to a computing and data storage platform to identify and match equivalent subsets of data between an ingested dataset, such as in a tabular data arrangement, and one or more graph-based data arrangements, according to at least some examples. For example, a method may include identifying a tabular data arrangement including a subset of data as a column, computing a compressed data representation for a column of data, correlating the compressed data representation to reference compressed data representations, detecting a link between a column of data associated with a correlated compressed data representation and a dataset stored in a graph data arrangement, and forming an expanded tabular data arrangement.
-
Publication No.: US20230153312A1
Publication date: 2023-05-18
Application No.: US17893100
Application date: 2022-08-22
Applicant: data.world, Inc.
Inventor: Bryon Kristen Jacob, Jon Loyens, David Lee Griffith, Brett A. Hurt, Triet Minh Le, Shad William Reynolds, Arthur Albert Keen, Joseph Boutros, Alexander John Zelenak
IPC: G06F16/2458, G06F16/25, G06N5/04, G06F16/215
CPC classification number: G06F16/2471, G06F16/256, G06F16/2465, G06N5/04, G06F16/252, G06F16/215, G06F16/258
Abstract: Various embodiments relate generally to data science and data analysis, computer software and systems, and wired and wireless network communications to provide an interface between repositories of disparate datasets and computing machine-based entities that seek access to the datasets, and, more specifically, to a computing and data storage platform that facilitates consolidation of one or more datasets, whereby a collaborative data layer and associated logic facilitate, for example, efficient access to, and implementation of, collaborative datasets. In some examples, a method may include receiving data representing a query into a collaborative dataset consolidation system, identifying datasets relevant to the query, generating one or more queries to access disparate data repositories, and retrieving data representing query results. In some cases, one or more queries are applied (e.g., as a federated query) to atomized datasets stored in one or more atomized data stores, at least two of which may be different.
-
Publication No.: US11573948B2
Publication date: 2023-02-07
Application No.: US17163287
Application date: 2021-01-29
Applicant: data.world, Inc.
Inventor: David Lee Griffith
IPC: G06F16/23, G06F16/28, G06F16/901
Abstract: Various embodiments relate generally to data science and data analysis, computer software and systems, and wired and wireless network communications to interface among repositories of disparate datasets and computing machine-based entities configured to access datasets, and, more specifically, to a computing and data storage platform to implement predicted data constraints to validate one or more portions of a dataset, according to at least some examples. For example, a method may include predicting a subset of constraint data to validate a graph-based data arrangement, and analyzing the graph-based data arrangement against a subset of constraint data to determine an action. At least one action may include validating data in a graph-based data arrangement. Also, the method may include integrating the graph-based data arrangement into a graph data arrangement responsive to determining data representing a validation.
-
Publication No.: US20230009198A1
Publication date: 2023-01-12
Application No.: US17728825
Application date: 2022-04-25
Applicant: data.world, Inc.
Inventor: Bryon Kristen Jacob, David Lee Griffith, Triet Minh Le, John Loyens, Brett A. Hurt, Arthur Albert Keen
IPC: G06F16/25, G06F16/245, G06F16/28
Abstract: Various embodiments relate generally to data science and data analysis, computer software and systems, and wired and wireless network communications to provide an interface between repositories of disparate datasets and computing machine-based entities that seek access to the datasets, and, more specifically, to a computing and data storage platform that facilitates consolidation of one or more datasets, whereby a collaborative data layer and associated logic facilitate, for example, efficient access to, and implementation of, collaborative datasets. In some examples, a method may include receiving data representing a query of a consolidated dataset that may include datasets formatted as atomized datasets, analyzing the query to classify portions of the query to form classified query portions, partitioning the query into sub-queries as a function of a classification type for each of the classified query portions, and retrieving data representing a query result from distributed data repositories.
-
Publication No.: US11327996B2
Publication date: 2022-05-10
Application No.: US16917228
Application date: 2020-06-30
Applicant: data.world, Inc.
Inventor: Shad William Reynolds, David Lee Griffith, Jon Loyens, Bryon Kristen Jacob
IPC: G06F16/25, G06F21/62, G06F3/0484, G06F40/169, G06F16/215, G06F16/906, G06F40/177, G06F3/04842
Abstract: Various embodiments relate generally to data science and data analysis, and computer software and systems, to provide an interface between repositories of disparate datasets and computing machine-based entities that seek access to the datasets, and, more specifically, to a computing and data storage platform that facilitates consolidation of one or more datasets, whereby user interfaces may be implemented as computerized tools for presenting summarization of dataset attributes to facilitate discovery, formation, and analysis of interrelated collaborative datasets. In some examples, a method may include receiving data resulting from insight calculations. Insight calculations may be based on a derived dataset attribute. Also, the method may include presenting a data arrangement overview summarizing the data attributes as an aggregation of data attributes in a portion of the user interface. The data arrangement overview may include an interactive display of a distribution associated with a collaborative atomized dataset.
-
Publication No.: US11327991B2
Publication date: 2022-05-10
Application No.: US16899549
Application date: 2020-06-11
Applicant: data.world, Inc.
Inventor: Shad William Reynolds, David Lee Griffith, Bryon Kristen Jacob
IPC: G06F16/25, G06N5/02, G06F16/2455, G06F16/248
Abstract: Various embodiments relate generally to data science and data analysis, computer software and systems, and network communications to interface among repositories of disparate datasets and computing machine-based entities configured to access datasets, and, more specifically, to a computing and data storage platform configured to provide one or more computerized tools to deploy predictive data models based on in-situ auxiliary query commands implemented in a query, and configured to facilitate development and management of data projects by providing an interactive, project-centric workspace interface coupled to collaborative computing devices and user accounts. For example, a method may include activating a query engine, implementing a subset of auxiliary instructions, at least one auxiliary instruction being configured to access model data, receiving a query that causes the query engine to access the model data, receiving serialized model data, performing a function associated with the serialized model data, and generating resultant data.
-
Publication No.: US20210224250A1
Publication date: 2021-07-22
Application No.: US17163287
Application date: 2021-01-29
Applicant: Data.World, Inc.
Inventor: David Lee Griffith
IPC: G06F16/23, G06F16/28, G06F16/901
Abstract: Various embodiments relate generally to data science and data analysis, computer software and systems, and wired and wireless network communications to interface among repositories of disparate datasets and computing machine-based entities configured to access datasets, and, more specifically, to a computing and data storage platform to implement predicted data constraints to validate one or more portions of a dataset, according to at least some examples. For example, a method may include predicting a subset of constraint data to validate a graph-based data arrangement, and analyzing the graph-based data arrangement against a subset of constraint data to determine an action. At least one action may include validating data in a graph-based data arrangement. Also, the method may include integrating the graph-based data arrangement into a graph data arrangement responsive to determining data representing a validation.
-
Publication No.: US20200218723A1
Publication date: 2020-07-09
Application No.: US16457759
Application date: 2019-06-28
Applicant: data.world, Inc.
Inventor: Bryon Kristen Jacob, David Lee Griffith, Triet Minh Le, Shad William Reynolds, Arthur Albert Keen
IPC: G06F16/2453, G06F16/901, G06F16/21
Abstract: Various techniques are described for platform management of integrated access of public and privately-accessible datasets utilizing federated query generation and query schema rewriting optimization, including receiving at a dataset access platform a query formatted according to a first data schema, generating a copy of the query, saving the query and the copy to a datastore, parsing the copy of the query in the first schema using an inference engine, determining whether the query comprises data associated with an access control condition associated with accessing the dataset, the access control condition being configured to indicate whether the query is permitted to access the dataset, and rewriting, using a proxy server, the copy of the query in a second schema, and optimizing the rewriting by identifying a database engine to execute the query and including other data converted into another triple associated with an attribute of the query.
-
10.
Publication No.: US20200034371A1
Publication date: 2020-01-30
Application No.: US16457766
Application date: 2019-06-28
Applicant: data.world, Inc.
Inventor: Shad William Reynolds, Bryon Kristen Jacob, Jon Loyens, David Lee Griffith, Triet Minh Le, Joseph Boutros
IPC: G06F16/25, G06F9/54, G06F16/21, G06F16/248
Abstract: Various embodiments relate generally to data science and data analysis, computer software and systems, and wired and wireless network communications to provide an interface between repositories of disparate datasets and computing machine-based entities that seek access to the datasets, and, more specifically, to a computing and data storage platform that facilitates consolidation of one or more datasets, whereby one or more computerized tools may be configured to discover, form, and analyze, for example, via one or more user interface applications, interrelations among a system of networked collaborative datasets. In some examples, a method may include causing transformation of a set of data to an atomized format to form an atomized dataset, monitoring creation of a dataset, and presenting data representing a status of a portion of the creation of the dataset. The status may depict an atomized dataset linked to at least one other dataset.