Semi-structured data storage and processing functionality to store sparse feature sets

    Publication number: US11461351B1

    Publication date: 2022-10-04

    Application number: US17390883

    Filing date: 2021-07-31

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives raw input data from a source table, the raw input data including data comprising input features for a machine learning model, the raw input data being in a first format including at least multiple rows with each row including multiple columns of values. Based at least in part on the source table, the subject technology generates table metadata corresponding to the source table. Based at least in part on the received raw input data, the subject technology generates column metadata corresponding to values from the source table. The subject technology generates cell data for a feature store table based at least in part on the values from the source table. The subject technology performs at least one database operation to generate the feature store table including at least the generated table metadata, the generated column metadata, and the generated cell data.
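
The abstract's flow (dense source rows in; table metadata, column metadata, and cell data out) can be illustrated with a minimal Python sketch. This is not the patented method: the function name `build_feature_store`, the metadata fields, and the choice to represent each row's cell data as a dictionary that omits null features are all assumptions made for illustration.

```python
from typing import Any


def build_feature_store(table_name: str, rows: list[dict[str, Any]]) -> dict[str, Any]:
    """Turn dense rows into table metadata, column metadata, and sparse cell data.

    Hypothetical layout: the patent abstract does not specify these fields.
    """
    # Collect column names in order of first appearance across all rows.
    columns: list[str] = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    # Table-level metadata derived from the source table as a whole.
    table_metadata = {
        "name": table_name,
        "row_count": len(rows),
        "column_count": len(columns),
    }

    # Column-level metadata derived from the values actually present.
    column_metadata = []
    for name in columns:
        present = [row[name] for row in rows if row.get(name) is not None]
        column_metadata.append({
            "name": name,
            "type": type(present[0]).__name__ if present else "null",
            "non_null_count": len(present),
        })

    # Cell data: one semi-structured object per row, with null features dropped
    # so that sparse feature sets do not pay for empty columns.
    cells = [{k: v for k, v in row.items() if v is not None} for row in rows]

    return {
        "table_metadata": table_metadata,
        "column_metadata": column_metadata,
        "cells": cells,
    }
```

Called on two rows such as `{"user_id": 1, "clicks": 3, "age": None}` and `{"user_id": 2, "clicks": None, "age": 41}`, the sketch keeps only the populated features in each cell object, while the column metadata still records all three columns.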

    PROCESSING FUNCTIONALITY TO STORE SPARSE FEATURE SETS

    Publication number: US20230409589A1

    Publication date: 2023-12-21

    Application number: US18458425

    Filing date: 2023-08-30

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/25 G06F16/24558 G06F16/86 G06F16/2282

    Abstract: The subject technology generates, by a database system, cell data for a particular table based on values from a source table, the values being based on raw input data, the source table comprising multiple rows and multiple columns, the raw input data comprising values in a first format, the values comprising input features corresponding to datasets included in the raw input data for machine learning models, the source table being provided by an external environment, the external environment comprising a system external to the database system. The subject technology performs a database operation to generate the particular table including table metadata, column metadata, and the generated cell data, the generated particular table comprising a second format that causes more efficient processing of data by the database system using a single query on the particular table compared to processing the raw input data from the source table.

    STORING FEATURE SETS USING SEMI-STRUCTURED DATA STORAGE

    Publication number: US20230004571A1

    Publication date: 2023-01-05

    Application number: US17899160

    Filing date: 2022-08-30

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives, by a database system, raw input data from a source table provided by a machine learning development environment, the source table comprising multiple rows where each row includes multiple columns, the raw input data comprising values in a first format, the values comprising input features corresponding to datasets included in the raw input data for machine learning models, the machine learning development environment comprising a system that is external to the database system and is accessed by a plurality of different users that are external to the database system. The subject technology generates cell data for a feature store table based at least in part on the values from the source table. The subject technology performs at least one database operation to generate the feature store table including at least table metadata, column metadata, and the generated cell data.
