Aggregation of ancillary data associated with source data in a system of networked collaborative datasets

    Publication No.: US11042548B2

    Publication date: 2021-06-22

    Application No.: US15927006

    Filing date: 2018-03-20

    Abstract: Various embodiments relate generally to data science and data analysis, computer software and systems, and, more specifically, to a computing and data storage platform that facilitates consolidation of one or more datasets, whereby logic is configured to remediate anomalies in a dataset originating in a first format prior to enrichment and conversion into a second format that facilitates forming a collaborative dataset and, for example, interrelations among a system of networked collaborative datasets, whereby, at least in some implementations, data interrelations between different formats may be disposed in one or more data layers (e.g., layered data files and/or data arrangements). In some examples, a method may include converting a dataset from a data format at a format converter to form an atomized dataset in a graph data arrangement, the atomized dataset being a collaborative dataset including atomized descriptor data and atomized source data.
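The conversion step described in this abstract can be pictured with a minimal sketch, assuming the graph data arrangement is a list of (subject, predicate, object) triples. The function name `atomize`, the `hasColumn` predicate, and the `ds1/row/N` identifiers are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Tuple

# A triple is (subject, predicate, object); this layout is an assumption.
Triple = Tuple[str, str, str]


def atomize(rows: List[Dict[str, str]], dataset_id: str) -> List[Triple]:
    """Convert tabular rows into triples (hypothetical format converter)."""
    triples: List[Triple] = []
    if rows:
        # Atomized descriptor data: one triple per column name.
        for column in rows[0]:
            triples.append((dataset_id, "hasColumn", column))
    for index, row in enumerate(rows):
        subject = f"{dataset_id}/row/{index}"
        # Atomized source data: one triple per cell value.
        for column, value in row.items():
            triples.append((subject, column, value))
    return triples


rows = [
    {"name": "a", "value": "1"},
    {"name": "b", "value": "2"},
]
triples = atomize(rows, "ds1")
for triple in triples:
    print(triple)
```

Keeping descriptor triples separate from source triples mirrors the abstract's distinction between atomized descriptor data and atomized source data.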

    EXTENDED COMPUTERIZED QUERY LANGUAGE SYNTAX FOR ANALYZING MULTIPLE TABULAR DATA ARRANGEMENTS IN DATA-DRIVEN COLLABORATIVE PROJECTS

    Publication No.: US20190065567A1

    Publication date: 2019-02-28

    Application No.: US16036834

    Filing date: 2018-07-16

    IPC classification: G06F17/30

    Abstract: Various embodiments relate generally to data science and data analysis, computer software and systems, and wired and wireless network communications to interface among repositories of disparate datasets and computing machine-based entities configured to access datasets, and, more specifically, to a computing and data storage platform configured to provide one or more computerized tools that facilitate development and management of data projects, including implementation of extended computerized query language syntax to analyze, for example, multiple tabular data arrangements in data-driven collaborative projects. For example, a method may include generating data to present a query editor in a data project interface, receiving data representing a first query command to select one or more subsets of data, identifying in the data representing a second query command a subset of datasets from which to extract the data, and applying a query based on the first query command and the second query command.

    INTERACTIVE INTERFACES TO PRESENT DATA ARRANGEMENT OVERVIEWS AND SUMMARIZED DATASET ATTRIBUTES FOR COLLABORATIVE DATASETS

    Publication No.: US20180210936A1

    Publication date: 2018-07-26

    Application No.: US15454969

    Filing date: 2017-03-09

    Abstract: Various embodiments relate generally to data science and data analysis, and computer software and systems, to provide an interface between repositories of disparate datasets and computing machine-based entities that seek access to the datasets, and, more specifically, to a computing and data storage platform that facilitates consolidation of one or more datasets, whereby user interfaces may be implemented as computerized tools for presenting summarization of dataset attributes to facilitate discovery, formation, and analysis of interrelated collaborative datasets. In some examples, a method may include receiving data resulting from insight calculations. Insight calculations may be based on a derived dataset attribute. Also, the method may include presenting a data arrangement overview summarizing the data attributes as an aggregation of data attributes in a portion of the user interface. The data arrangement overview may include an interactive display of a distribution associated with a collaborative atomized dataset.
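As a rough illustration of the kind of aggregation a data arrangement overview might display, the sketch below summarizes a single column. The function `summarize_column` and the chosen attributes (count, distinct, empty, distribution) are assumptions made for illustration; the abstract does not specify which insight calculations are performed.

```python
from collections import Counter
from typing import Dict, List


def summarize_column(values: List[str]) -> Dict[str, object]:
    """Aggregate simple attributes of one column (hypothetical helper)."""
    counts = Counter(values)
    return {
        "count": len(values),                        # total cells
        "distinct": len(counts),                     # unique values
        "empty": sum(1 for v in values if v == ""),  # blank cells
        "distribution": counts.most_common(),        # value frequencies
    }


summary = summarize_column(["a", "b", "a", ""])
print(summary)
```

A user interface could render the `distribution` entry as a histogram, which corresponds loosely to the interactive display of a distribution mentioned in the abstract.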

    Localized link formation to perform implicitly federated queries using extended computerized query language syntax

    Publication No.: US11042556B2

    Publication date: 2021-06-22

    Application No.: US16036836

    Filing date: 2018-07-16

    Abstract: Various embodiments relate generally to data science and data analysis, computer software and systems, and more specifically, to a computing and data storage platform configured to provide one or more computerized tools that facilitate development and management of data projects, including implementation of localized link identifiers to perform implicitly federated queries using, in some examples, extended computerized query language syntax to analyze multiple tabular data arrangements in data-driven collaborative projects. For example, a method may include importing a dataset into a data project, identifying a remote link identifier associated with a remote data source at which the dataset is stored, transforming the remote dataset identifier to form data representing a link identifier, and presenting in a data project user interface the link identifier as associated with a local namespace associated with the data project to perform implicit query federation using, for example, an extended multi-table syntax.
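The link transformation described in this abstract might be sketched as follows, assuming the remote link identifier is a URL and the localized link identifier takes the form `namespace:name`. The functions `to_local_link` and `import_dataset`, the `links` registry, and the example URL are all hypothetical.

```python
from typing import Dict
from urllib.parse import urlparse

# Maps each local link identifier back to its remote source (assumed registry).
links: Dict[str, str] = {}


def to_local_link(remote_url: str, project_namespace: str) -> str:
    """Derive a local link identifier from a remote URL (hypothetical scheme)."""
    path = urlparse(remote_url).path.strip("/")
    # Use the last path segment as the dataset name, with a fallback.
    name = path.split("/")[-1] if path else "dataset"
    return f"{project_namespace}:{name}"


def import_dataset(remote_url: str, project_namespace: str) -> str:
    """Register a remote dataset under the project's local namespace."""
    local_link = to_local_link(remote_url, project_namespace)
    links[local_link] = remote_url
    return local_link


local = import_dataset("https://example.org/owner/sales-2018", "my-project")
print(local, "->", links[local])
```

With such a registry, a query that names only the local identifier could be resolved to the remote source behind the scenes, which is one plausible reading of implicit query federation.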