WATERMARK-BASED TECHNIQUES FOR CHANGE-DATA-CAPTURE

    Publication number: US20210182267A1

    Publication date: 2021-06-17

    Application number: US17105830

    Application date: 2020-11-27

    Applicant: NETFLIX, INC.

    Abstract: Various embodiments set forth systems and techniques for concurrent log and dump processing. The techniques include selecting, from a datastore, a chunk comprising one or more rows of data; comparing the one or more rows of data in the chunk with a first set of log events in a change log associated with the datastore, wherein each log event included in the first set of log events occurs after a first log event in the change log and prior to a second log event in the change log; selecting, based on the comparison, one or more non-conflicting rows in the chunk; and transmitting, to an output, one or more log events associated with the one or more non-conflicting rows in the chunk prior to processing a second set of log events in the change log, wherein the second set of log events occurs after the second log event.
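The selection step in this abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `LogEvent` and `merge_chunk`, the `"low"`/`"high"` marker kinds standing in for the "first" and "second" log events, and the rule that a row conflicts when its primary key appears in a log event inside the window are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class LogEvent:
    """Hypothetical change-log entry; kind is 'change', 'low', or 'high'."""
    kind: str
    key: Any = None
    payload: Any = None


def merge_chunk(chunk: Dict[Any, Any], log: List[LogEvent]) -> List[Tuple[str, Any, Any]]:
    """Interleave one dump chunk with a change log.

    chunk maps primary key -> row as read from the datastore.
    log contains one 'low' marker and one later 'high' marker.
    """
    output = []
    remaining = dict(chunk)   # rows still considered non-conflicting
    in_window = False
    for event in log:
        if event.kind == "low":
            in_window = True
        elif event.kind == "high":
            in_window = False
            # Emit surviving rows before any log event after the high marker.
            output.extend(("dump", key, row) for key, row in remaining.items())
            remaining = {}
        else:
            if in_window:
                # A change inside the window supersedes the dumped row (assumed rule).
                remaining.pop(event.key, None)
            output.append(("log", event.key, event.payload))
    return output


chunk = {1: "a", 2: "b"}
log = [
    LogEvent("change", 1, "a0"),
    LogEvent("low"),
    LogEvent("change", 2, "b2"),
    LogEvent("high"),
    LogEvent("change", 1, "a3"),
]
result = merge_chunk(chunk, log)
```

In this example row 2 is dropped from the chunk because it changed inside the window, while row 1 is emitted ahead of the later change to key 1.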

    TECHNIQUES FOR DYNAMICALLY BENCHMARKING CLOUD DATA STORE SYSTEMS

    Publication number: US20180060154A1

    Publication date: 2018-03-01

    Application number: US15394448

    Application date: 2016-12-29

    Applicant: NETFLIX, INC.

    CPC classification number: H04L67/1097 G06F11/34

    Abstract: In various embodiments, a benchmarking engine automatically tests a data store to assess functionality and/or performance of the data store. The benchmarking engine generates data store operations based on dynamically adjustable configuration data. As the benchmarking engine generates the data store operations, the data store operations execute on the data store. In a complementary fashion, as the data store operations execute on the data store, the benchmarking engine generates statistics based on the results of the executed data store operations. Advantageously, because the benchmarking engine adjusts the number and/or type of data store operations that the benchmarking engine generates based on any changes to the configuration data, the workload that executes on the data store may be fine-tuned as the benchmarking engine executes.
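As a rough illustration of a workload that follows configuration changes while it runs, the sketch below re-reads a mutable configuration on every step. Everything in it is hypothetical: the `run_benchmark` function, the `write_ratio` and `key_space` settings, the `on_step` hook, and the use of a plain dict as the data store are assumptions, not details from the patent.

```python
import random
from collections import Counter
from typing import Callable, Dict, Optional


def run_benchmark(
    store: Dict[int, int],
    config: Dict[str, float],
    steps: int,
    on_step: Optional[Callable[[int, Dict[str, float]], None]] = None,
    seed: int = 0,
) -> Counter:
    """Generate reads and writes against store, re-reading config each step."""
    rng = random.Random(seed)
    stats = Counter()
    for step in range(steps):
        if on_step is not None:
            on_step(step, config)  # the hook may adjust the configuration mid-run
        key = rng.randrange(int(config["key_space"]))
        if rng.random() < config["write_ratio"]:
            store[key] = step
            stats["writes"] += 1
        elif key in store:
            stats["read_hits"] += 1
        else:
            stats["read_misses"] += 1
    return stats


def switch_to_reads(step: int, config: Dict[str, float]) -> None:
    # Hypothetical adjustment: stop issuing writes after the fifth step.
    if step == 5:
        config["write_ratio"] = 0.0


stats = run_benchmark({}, {"write_ratio": 1.0, "key_space": 100}, steps=10, on_step=switch_to_reads)
```

Because the loop consults `config` on every iteration, the change made at step 5 takes effect immediately, without restarting the run.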

    TECHNIQUES FOR PERFORMING DATA RECONCILIATION IN DISTRIBUTED DATA STORE SYSTEMS

    Publication number: US20170193031A1

    Publication date: 2017-07-06

    Application number: US14987649

    Application date: 2016-01-04

    Applicant: NETFLIX, INC.

    Abstract: In one embodiment, a data reconciliation engine works with data store nodes included in a distributed data store system to ensure consistency between the data store nodes. In operation, the data reconciliation engine receives a different data snapshot from each of the data store nodes. In response, the data reconciliation engine generates one or more recommendations designed to resolve inconsistencies between the data snapshots. The data reconciliation engine then transmits each recommendation to a different data store node. Because the data reconciliation engine performs many of the resource-intensive operations included in the data reconciliation process, the resources of the data store nodes may focus primarily on processing client requests instead of performing data reconciliation operations. Consequently, unlike conventional data store node based reconciliation applications, the data reconciliation engine may process large volumes of data without unacceptably increasing the time required for the distributed data store system to respond to client requests.
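The snapshot-to-recommendation flow can be sketched as follows. The abstract does not say how inconsistencies are resolved, so the last-write-wins rule based on a per-value timestamp is an assumption, as are the `reconcile` function and the snapshot layout (node name mapped to `{key: (value, timestamp)}`).

```python
from typing import Any, Dict, Tuple

Snapshot = Dict[str, Tuple[Any, int]]  # key -> (value, timestamp)


def reconcile(snapshots: Dict[str, Snapshot]) -> Dict[str, Snapshot]:
    """Return, per node, the entries that node should apply to converge."""
    # Pick the newest version of every key across all snapshots
    # (last-write-wins is an assumed policy, not the patent's stated method).
    newest: Snapshot = {}
    for snapshot in snapshots.values():
        for key, (value, ts) in snapshot.items():
            if key not in newest or ts > newest[key][1]:
                newest[key] = (value, ts)

    # Build one recommendation per node that is missing or behind on some key.
    recommendations: Dict[str, Snapshot] = {}
    for node, snapshot in snapshots.items():
        fixes = {key: entry for key, entry in newest.items() if snapshot.get(key) != entry}
        if fixes:
            recommendations[node] = fixes
    return recommendations


snapshots = {
    "node_a": {"x": ("1", 10)},
    "node_b": {"x": ("2", 20), "y": ("3", 5)},
}
recommendations = reconcile(snapshots)
```

Here only `node_a` receives a recommendation, since `node_b` already holds the newest version of every key. The comparison work happens outside the nodes, which is the point the abstract makes about offloading.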

    WATERMARK-BASED TECHNIQUES FOR CHANGE-DATA-CAPTURE

    Publication number: US20220276993A1

    Publication date: 2022-09-01

    Application number: US17745739

    Application date: 2022-05-16

    Applicant: NETFLIX, INC.

    Abstract: Various embodiments set forth systems and techniques for concurrent log and dump processing. The techniques include selecting, from a datastore, a chunk comprising one or more rows of data; comparing the one or more rows of data in the chunk with a first set of log events in a change log associated with the datastore, wherein each log event included in the first set of log events occurs after a first log event in the change log and prior to a second log event in the change log; selecting, based on the comparison, one or more non-conflicting rows in the chunk; and transmitting, to an output, one or more log events associated with the one or more non-conflicting rows in the chunk prior to processing a second set of log events in the change log, wherein the second set of log events occurs after the second log event.

    TECHNIQUES FOR WARMING UP A NODE IN A DISTRIBUTED DATA STORE

    Publication number: US20170353515A1

    Publication date: 2017-12-07

    Application number: US15379299

    Application date: 2016-12-14

    Applicant: NETFLIX Inc.

    Abstract: In various embodiments, a node manager configures a “new” node as a replacement for an “unavailable” node that was previously included in a distributed data store. First, the node manager identifies a source node that stores client data that was also stored in the unavailable node. Subsequently, the node manager configures the new node to operate as a slave of the source node and streams the client data from the source node to the new node. Finally, the node manager configures the new node to operate as one of multiple master nodes in the distributed data store. Advantageously, by configuring the node to implement a hybrid of a master-slave replication scheme and a master-master replication scheme, the node manager enables the distributed data store to process client requests without interruption while automatically restoring the previous level of redundancy provided by the distributed data store.
