Inferring a dataset schema from input files

    Publication number: US11907181B2

    Publication date: 2024-02-20

    Application number: US16748351

    Application date: 2020-01-21

    Inventor: Nir Ackner, Eric Lin

    CPC classification number: G06F16/211 G06F3/0638 G06F40/205

    Abstract: Techniques for generating a schema for a data input file are described herein. In an embodiment, a server computer receives a data input file. The server computer system selects a sample excerpt from the data input which comprises a subset of the data input file. The server computer system analyzes the sample excerpt to determine a row delimiter for the data input file, a column delimiter for the data input file, and a plurality of data format types. Using the column delimiter, row delimiter, and plurality of data format types, the server computer system generates a candidate schema for the data input file.
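The steps named in this abstract (take a sample excerpt, determine the row delimiter, the column delimiter, and per-column data types, then emit a candidate schema) can be illustrated with a minimal sketch. This is not the patented method: the candidate delimiter list, the consistency heuristic, the three type names, and the function names `infer_column_delimiter`, `infer_type`, and `infer_schema` are all assumptions made for illustration, and the naive `split` ignores quoted fields.

```python
from collections import Counter

# Assumed candidate set; the abstract does not enumerate delimiters.
CANDIDATE_DELIMITERS = [",", "\t", ";", "|"]


def infer_column_delimiter(rows):
    """Return the candidate whose nonzero per-row count is shared by the most rows."""
    best, best_score = None, 0
    for delim in CANDIDATE_DELIMITERS:
        counts = Counter(row.count(delim) for row in rows)
        count, freq = counts.most_common(1)[0]
        if count > 0 and freq > best_score:
            best, best_score = delim, freq
    return best


def infer_type(values):
    """Return the narrowest assumed type that parses every value in the column."""
    for name, cast in (("integer", int), ("double", float)):
        try:
            for value in values:
                cast(value)
            return name
        except ValueError:
            continue
    return "string"


def infer_schema(text, sample_chars=4096):
    """Build a candidate schema from a leading sample excerpt of the input."""
    sample = text[:sample_chars]
    row_delim = "\r\n" if "\r\n" in sample else "\n"
    rows = [r for r in sample.split(row_delim) if r]
    if len(text) > sample_chars and len(rows) > 1:
        rows = rows[:-1]  # the last sampled row may be cut off mid-line
    if not rows:
        raise ValueError("sample excerpt contains no rows")
    col_delim = infer_column_delimiter(rows)
    table = [r.split(col_delim) if col_delim else [r] for r in rows]
    # Use the most common row width and ignore rows that deviate from it.
    width = Counter(len(r) for r in table).most_common(1)[0][0]
    types = [
        infer_type([r[i] for r in table if len(r) == width])
        for i in range(width)
    ]
    return {
        "row_delimiter": row_delim,
        "column_delimiter": col_delim,
        "types": types,
    }
```

Working on a bounded excerpt keeps the cost independent of file size, at the price of possibly missing values that appear only later in the file; a production parser would also need quote and escape handling.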

    INFERRING A DATASET SCHEMA FROM INPUT FILES
    Invention application

    Publication number: US20190108244A1

    Publication date: 2019-04-11

    Application number: US16210984

    Application date: 2018-12-05

    Inventor: Nir Ackner, Eric Lin

    CPC classification number: G06F16/211 G06F3/0638 G06F17/2705

    Abstract: Techniques for generating a schema for a data input file are described herein. In an embodiment, a server computer receives a data input file. The server computer system selects a sample excerpt from the data input which comprises a subset of the data input file. The server computer system analyzes the sample excerpt to determine a row delimiter for the data input file, a column delimiter for the data input file, and a plurality of data format types. Using the column delimiter, row delimiter, and plurality of data format types, the server computer system generates a candidate schema for the data input file.

    Inferring a dataset schema from input files

    Publication number: US12210491B2

    Publication date: 2025-01-28

    Application number: US18438301

    Application date: 2024-02-09

    Inventor: Nir Ackner, Eric Lin

    Abstract: A method comprises selecting a sample excerpt from a data input file; in response to the determining that a first row in the sample excerpt does not contain a delimited value and a second row does contain a delimited value, determining that the first row consists of header data; identifying one or more jagged rows based on row delimiters that were erroneously placed; causing displaying text that led to creation of a jagged row; receiving an addition or removal of a specific row delimiter to the text; updating the sample excerpt based on the addition or the removal; analyzing the sample excerpt to determine a row delimiter for the data input file; identifying a plurality of rows that is not included in the header data; identifying a plurality of candidate column delimiters and generating a candidate schema for the data input file.
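Two of the steps in this abstract, treating a leading undelimited row as header data and identifying jagged rows, can be sketched as follows. The function names `split_header` and `find_jagged_rows`, and the rule that a row is "jagged" when its field count differs from the most common one, are assumptions for illustration rather than the claimed method; the interactive step of displaying the offending text and accepting a corrected row delimiter is not modelled.

```python
from collections import Counter


def split_header(rows, delimiter):
    """Split rows into (header, body): leading rows without the delimiter
    are treated as header data once a delimited row follows them."""
    for i, row in enumerate(rows):
        if delimiter in row:
            return rows[:i], rows[i:]
    return [], rows  # no delimited row found, so nothing is classed as header


def find_jagged_rows(rows, delimiter):
    """Return indices of rows whose field count differs from the most common one."""
    if not rows:
        return []
    widths = [len(row.split(delimiter)) for row in rows]
    expected = Counter(widths).most_common(1)[0][0]
    return [i for i, width in enumerate(widths) if width != expected]


# Hypothetical sample: "4,5" and "6" are one record broken by a stray row delimiter.
sample_rows = ["Report generated 2020", "a,b,c", "1,2,3", "4,5", "6", "7,8,9"]
header, body = split_header(sample_rows, ",")
jagged = find_jagged_rows(body, ",")
```

Here `header` holds the single undelimited title row and `jagged` points at the two fragments of the broken record, which is the text a user would be shown in order to remove the stray row delimiter.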

    INFERRING A DATASET SCHEMA FROM INPUT FILES
    Invention publication

    Publication number: US20240184754A1

    Publication date: 2024-06-06

    Application number: US18438301

    Application date: 2024-02-09

    Inventor: Nir Ackner, Eric Lin

    CPC classification number: G06F16/211 G06F3/0638 G06F40/205

    Abstract: A method comprises selecting a sample excerpt from a data input file; in response to the determining that a first row in the sample excerpt does not contain a delimited value and a second row does contain a delimited value, determining that the first row consists of header data; identifying one or more jagged rows based on row delimiters that were erroneously placed; causing displaying text that led to creation of a jagged row; receiving an addition or removal of a specific row delimiter to the text; updating the sample excerpt based on the addition or the removal; analyzing the sample excerpt to determine a row delimiter for the data input file; identifying a plurality of rows that is not included in the header data; identifying a plurality of candidate column delimiters and generating a candidate schema for the data input file.

    Inferring a dataset schema from input files

    Publication number: US10204119B1

    Publication date: 2019-02-12

    Application number: US15654952

    Application date: 2017-07-20

    Inventor: Nir Ackner, Eric Lin

    Abstract: Techniques for generating a schema for a data input file are described herein. In an embodiment, a server computer receives a data input file. The server computer system selects a sample excerpt from the data input which comprises a subset of the data input file. The server computer system analyzes the sample excerpt to determine a row delimiter for the data input file, a column delimiter for the data input file, and a plurality of data format types. Using the column delimiter, row delimiter, and plurality of data format types, the server computer system generates a candidate schema for the data input file.

    INFERRING A DATASET SCHEMA FROM INPUT FILES
    Invention application

    Publication number: US20200159704A1

    Publication date: 2020-05-21

    Application number: US16748351

    Application date: 2020-01-21

    Inventor: Nir Ackner, Eric Lin

    Abstract: Techniques for generating a schema for a data input file are described herein. In an embodiment, a server computer receives a data input file. The server computer system selects a sample excerpt from the data input which comprises a subset of the data input file. The server computer system analyzes the sample excerpt to determine a row delimiter for the data input file, a column delimiter for the data input file, and a plurality of data format types. Using the column delimiter, row delimiter, and plurality of data format types, the server computer system generates a candidate schema for the data input file.

    Inferring a dataset schema from input files

    Publication number: US10540333B2

    Publication date: 2020-01-21

    Application number: US16210984

    Application date: 2018-12-05

    Inventor: Nir Ackner, Eric Lin

    Abstract: Techniques for generating a schema for a data input file are described herein. In an embodiment, a server computer receives a data input file. The server computer system selects a sample excerpt from the data input which comprises a subset of the data input file. The server computer system analyzes the sample excerpt to determine a row delimiter for the data input file, a column delimiter for the data input file, and a plurality of data format types. Using the column delimiter, row delimiter, and plurality of data format types, the server computer system generates a candidate schema for the data input file.
