Data ingestion with spatial and temporal locality

    Publication No.: US12072892B2

    Publication date: 2024-08-27

    Application No.: US17983755

    Filing date: 2022-11-09

    Abstract: Implementations described herein relate to methods, systems, and computer-readable media to write data records. In some implementations, a method may include calculating a data rate of a data stream that includes a plurality of data records and determining if the data rate of the data stream is less than an ingest threshold. The method may further include, if the data rate of the data stream is less than the ingest threshold, calculating a number of write requests per time unit based on the data stream; determining a storage capacity per storage bucket; determining a read interval for the data stream; based on the number of write requests per time unit, the storage capacity, and the read interval, selecting a size of time window per storage bucket; and writing the plurality of data records to a particular storage bucket.
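The window-selection step in this abstract can be illustrated with a minimal sketch. The abstract names the three inputs (write requests per time unit, storage capacity per bucket, read interval) but not the selection rule, so the rule used here is an assumption: take the time a bucket needs to fill at the observed write rate and cap it at the read interval. All function and parameter names are hypothetical and do not come from the patent.

```python
from typing import Optional


def select_window_seconds(writes_per_second: float,
                          bucket_capacity_writes: float,
                          read_interval_seconds: float) -> float:
    """Pick a time-window size for one storage bucket.

    ASSUMED rule (not stated in the abstract): the window is the time a
    bucket takes to fill at the observed write rate, capped at the read
    interval so that one read never has to span more than one bucket.
    """
    if writes_per_second <= 0:
        return read_interval_seconds
    fill_time = bucket_capacity_writes / writes_per_second
    return min(fill_time, read_interval_seconds)


def bucket_key(stream_id: str, timestamp: float, window_seconds: float) -> str:
    """Map a record timestamp to the bucket that covers its time window."""
    window_start = int(timestamp // window_seconds * window_seconds)
    return f"{stream_id}/{window_start}"


def plan_bucket(stream_id: str, timestamp: float, data_rate: float,
                ingest_threshold: float, writes_per_second: float,
                bucket_capacity_writes: float,
                read_interval_seconds: float) -> Optional[str]:
    """Return the target bucket key, or None when the data rate is not below
    the ingest threshold (the abstract does not describe that branch)."""
    if data_rate >= ingest_threshold:
        return None
    window = select_window_seconds(writes_per_second,
                                   bucket_capacity_writes,
                                   read_interval_seconds)
    return bucket_key(stream_id, timestamp, window)
```

Under this assumed rule, a stream issuing 50 writes per second into buckets that hold 10,000 writes gets a 200-second window, so records close in time land in the same bucket, which is the temporal locality the title refers to.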

    Data ingestion with spatial and temporal locality

    Publication No.: US11520794B2

    Publication date: 2022-12-06

    Application No.: US17144054

    Filing date: 2021-01-07

    Abstract: Implementations described herein relate to methods, systems, and computer-readable media to write data records. In some implementations, a method may include calculating a data rate of a data stream that includes a plurality of data records and determining if the data rate of the data stream is less than an ingest threshold. The method may further include, if the data rate of the data stream is less than the ingest threshold, calculating a number of write requests per time unit based on the data stream; determining a storage capacity per storage bucket; determining a read interval for the data stream; based on the number of write requests per time unit, the storage capacity, and the read interval, selecting a size of time window per storage bucket; and writing the plurality of data records to a particular storage bucket.

    Analysis of streaming data using deltas and snapshots

    Publication No.: US12130776B2

    Publication date: 2024-10-29

    Application No.: US18078665

    Filing date: 2022-12-09

    Abstract: Implementations described herein relate to methods, systems, and computer-readable media to obtain snapshots used for analysis of streaming data. In some implementations, a computer-implemented method includes receiving initial data that includes a plurality of identifiers and corresponding timestamps, generating and storing a snapshot based on the initial data, wherein the snapshot includes the identifiers and a corresponding status, receiving a data stream that includes a subset of the identifiers, activity information for each identifier in the subset, and corresponding timestamps. The method further includes periodically analyzing the data stream to obtain a delta that includes an updated status for each identifier in the subset, storing the delta separate from the snapshot. The method further includes receiving a request for identifiers that are active in a particular time period, and based on the particular time period, retrieving active identifiers from the data stream, the delta, or the snapshot.
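The snapshot, delta, and stream tiers described in this abstract can be sketched as follows. This is an illustration under stated assumptions rather than the patented method: the class name, the "active"/"inactive" status strings, and the routing rule (choose the tier by comparing the start of the requested period with the time of the last snapshot and the last delta) are all hypothetical. As a simplification, each query reads exactly one tier.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ActivityStore:
    snapshot_time: float
    snapshot: Dict[str, str]                      # identifier -> status at snapshot_time
    delta: Dict[str, str] = field(default_factory=dict)
    stream: List[Tuple[str, float]] = field(default_factory=list)  # (identifier, timestamp)
    delta_time: float = field(init=False)

    def __post_init__(self) -> None:
        # Until the first periodic analysis runs, the delta covers nothing.
        self.delta_time = self.snapshot_time

    def ingest(self, identifier: str, timestamp: float) -> None:
        """Append one activity event from the data stream."""
        self.stream.append((identifier, timestamp))

    def compute_delta(self, now: float) -> None:
        """Periodic analysis: record updated statuses in the delta,
        which is kept separate from the snapshot."""
        for identifier, ts in self.stream:
            if self.delta_time < ts <= now:
                self.delta[identifier] = "active"
        self.delta_time = now

    def active_ids(self, period_start: float) -> Set[str]:
        """ASSUMED routing: the newest periods are answered from the raw
        stream, older ones from the delta, the oldest from the snapshot."""
        if period_start >= self.delta_time:
            return {i for i, ts in self.stream if ts >= period_start}
        if period_start >= self.snapshot_time:
            return {i for i, s in self.delta.items() if s == "active"}
        return {i for i, s in self.snapshot.items() if s == "active"}


store = ActivityStore(snapshot_time=100.0,
                      snapshot={"a": "active", "b": "inactive"})
store.ingest("b", 150.0)
store.compute_delta(now=200.0)
store.ingest("c", 250.0)
```

With the events above, a request starting at time 220 is served from the stream, one starting at 120 from the delta, and one starting at 50 from the snapshot.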

    DATA INGESTION WITH SPATIAL AND TEMPORAL LOCALITY

    Publication No.: US20240004883A1

    Publication date: 2024-01-04

    Application No.: US17983755

    Filing date: 2022-11-09

    Abstract: Implementations described herein relate to methods, systems, and computer-readable media to write data records. In some implementations, a method may include calculating a data rate of a data stream that includes a plurality of data records and determining if the data rate of the data stream is less than an ingest threshold. The method may further include, if the data rate of the data stream is less than the ingest threshold, calculating a number of write requests per time unit based on the data stream; determining a storage capacity per storage bucket; determining a read interval for the data stream; based on the number of write requests per time unit, the storage capacity, and the read interval, selecting a size of time window per storage bucket; and writing the plurality of data records to a particular storage bucket.

    ANALYSIS OF STREAMING DATA USING DELTAS AND SNAPSHOTS

    Publication No.: US20230359587A1

    Publication date: 2023-11-09

    Application No.: US18078665

    Filing date: 2022-12-09

    Abstract: Implementations described herein relate to methods, systems, and computer-readable media to obtain snapshots used for analysis of streaming data. In some implementations, a computer-implemented method includes receiving initial data that includes a plurality of identifiers and corresponding timestamps, generating and storing a snapshot based on the initial data, wherein the snapshot includes the identifiers and a corresponding status, receiving a data stream that includes a subset of the identifiers, activity information for each identifier in the subset, and corresponding timestamps. The method further includes periodically analyzing the data stream to obtain a delta that includes an updated status for each identifier in the subset, storing the delta separate from the snapshot. The method further includes receiving a request for identifiers that are active in a particular time period, and based on the particular time period, retrieving active identifiers from the data stream, the delta, or the snapshot.

    Analysis of streaming data using deltas and snapshots

    Publication No.: US11537554B2

    Publication date: 2022-12-27

    Application No.: US16918294

    Filing date: 2020-07-01

    Abstract: Implementations described herein relate to methods, systems, and computer-readable media to obtain snapshots used for analysis of streaming data. In some implementations, a computer-implemented method includes receiving initial data that includes a plurality of identifiers and corresponding timestamps, generating and storing a snapshot based on the initial data, wherein the snapshot includes the identifiers and a corresponding status, receiving a data stream that includes a subset of the identifiers, activity information for each identifier in the subset, and corresponding timestamps. The method further includes periodically analyzing the data stream to obtain a delta that includes an updated status for each identifier in the subset, storing the delta separate from the snapshot. The method further includes receiving a request for identifiers that are active in a particular time period, and based on the particular time period, retrieving active identifiers from the data stream, the delta, or the snapshot.

    DATA INGESTION WITH SPATIAL AND TEMPORAL LOCALITY

    Publication No.: US20210209115A1

    Publication date: 2021-07-08

    Application No.: US17144054

    Filing date: 2021-01-07

    Abstract: Implementations described herein relate to methods, systems, and computer-readable media to write data records. In some implementations, a method may include calculating a data rate of a data stream that includes a plurality of data records and determining if the data rate of the data stream is less than an ingest threshold. The method may further include, if the data rate of the data stream is less than the ingest threshold, calculating a number of write requests per time unit based on the data stream; determining a storage capacity per storage bucket; determining a read interval for the data stream; based on the number of write requests per time unit, the storage capacity, and the read interval, selecting a size of time window per storage bucket; and writing the plurality of data records to a particular storage bucket.

    ANALYSIS OF STREAMING DATA USING DELTAS AND SNAPSHOTS

    Publication No.: US20210004352A1

    Publication date: 2021-01-07

    Application No.: US16918294

    Filing date: 2020-07-01

    Abstract: Implementations described herein relate to methods, systems, and computer-readable media to obtain snapshots used for analysis of streaming data. In some implementations, a computer-implemented method includes receiving initial data that includes a plurality of identifiers and corresponding timestamps, generating and storing a snapshot based on the initial data, wherein the snapshot includes the identifiers and a corresponding status, receiving a data stream that includes a subset of the identifiers, activity information for each identifier in the subset, and corresponding timestamps. The method further includes periodically analyzing the data stream to obtain a delta that includes an updated status for each identifier in the subset, storing the delta separate from the snapshot. The method further includes receiving a request for identifiers that are active in a particular time period, and based on the particular time period, retrieving active identifiers from the data stream, the delta, or the snapshot.