Abstract:
Replication is improved in a globally distributed database, such as a replicated sharded database, that uses Raft-based asynchronous database replication. Improvements include Raft log persistence, coordination of followers' processing speed, transaction outcome determination, and column name compression, as well as faster failover through heartbeat consolidation and by keeping followers' apply processes running across failovers.
Abstract:
Techniques are described for generating seasonal forecasts. According to an embodiment, a set of time-series data is associated with one or more classes, which may include a first class that represents a dense pattern that repeats over multiple instances of a season in the set of time-series data and a second class that represents another pattern that repeats over multiple instances of the season in the set of time-series data. A particular class of data is associated with at least two sub-classes of data, where a first sub-class represents high data points from the first class, and a second sub-class represents another set of data points from the first class. A trend rate is determined for a particular sub-class. Based at least in part on the trend rate, a forecast is generated.
Abstract:
Techniques are provided for eager replication of uncommitted transactions. In embodiments, a replication client receives, in a data stream, change records corresponding to database changes applied to a source database in a transaction. The change records do not include a commit record that indicates that the transaction is committed on the source database. Before receiving the commit record, the replication client computes transaction dependency data based on the change records and detects, based on the transaction dependency data, that the transaction can be at least partially applied to a target database. Also before receiving the commit record, the replication client applies, to the target database and based on the detecting, at least some of the change records. Upon receiving the commit record of the transaction, the replication client completes applying the change records and commits the transaction on the target database.
Abstract:
Techniques are described for characterizing and summarizing seasonal patterns detected within a time series. According to an embodiment, a set of time series data is analyzed to identify a plurality of instances of a season, where each instance corresponds to a respective sub-period within the season. A first set of instances from the plurality of instances is associated with a particular class of seasonal pattern. After classifying the first set of instances, a second set of instances may remain unclassified or otherwise may not be associated with the particular class of seasonal pattern. Based on the first and second sets of instances, a summary may be generated that identifies one or more stretches of time that are associated with the particular class of seasonal pattern. The one or more stretches of time may span at least one sub-period corresponding to at least one instance in the second set of instances.
Abstract:
Techniques are provided for automatic parallelism tuning. At least one batch of change records is assigned to one or more apply processes in a set of active apply processes. A first throughput value is periodically determined based on a number of processed change records in a first time interval. An increment adjustment is periodically performed, including adding an additional apply process, determining a second throughput value, and removing the additional apply process from the set of active apply processes if the second throughput value is not greater than a previous first throughput value by at least an increment threshold. A decrement adjustment is periodically performed, including removing an apply process, determining a third throughput value, and replacing the removed apply process in the set of active apply processes if the third throughput value is not greater than the previous first throughput value by at least a decrement threshold.
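The increment and decrement adjustments can be read as two small probe-and-revert steps. The following is a minimal sketch of that reading, not the described implementation: the function names, the `measure_throughput` callable, and the choice to track only a count of apply processes are assumptions made for illustration. The decrement threshold is treated as a plain number that may be zero or negative.

```python
def increment_adjustment(active, measure_throughput, previous_throughput,
                         increment_threshold):
    """Probe with one extra apply process; keep it only if throughput gains enough.

    active: current number of active apply processes.
    measure_throughput: hypothetical callable mapping a process count to the
        number of change records processed in one time interval.
    Returns (new_active, new_baseline_throughput).
    """
    second = measure_throughput(active + 1)
    if second - previous_throughput >= increment_threshold:
        return active + 1, second
    # Gain too small: remove the additional apply process again.
    return active, previous_throughput


def decrement_adjustment(active, measure_throughput, previous_throughput,
                         decrement_threshold):
    """Probe with one fewer apply process; put it back unless the threshold is met."""
    if active <= 1:
        # Assumption: at least one apply process always stays active.
        return active, previous_throughput
    third = measure_throughput(active - 1)
    if third - previous_throughput >= decrement_threshold:
        return active - 1, third
    # Condition not met: replace the removed apply process.
    return active, previous_throughput
```

With a throughput curve that peaks at three processes, repeated calls would settle near that peak from either side.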
Abstract:
Parallel logical replication involves multiple apply threads running on a destination database server applying, in parallel, changes made by source transactions, where the changes of a single source transaction may be applied in parallel by multiple apply threads. An apply transaction for a source transaction may be committed by an apply thread independently of the commitment of any other apply transaction of the source transaction, that is, without coordinating the committing of another apply transaction executed by another apply thread for the source transaction. A configuration language is used to configure parallel logical replication. The language facilitates the configuration of various aspects of parallel logical replication, including the number of apply threads, partitioning schemes for the apply threads for partitioning change records between the apply threads, and various other aspects of parallel logical replication.
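One way to picture a partitioning scheme for apply threads is a hash on a key taken from each change record. The sketch below is an illustration only: the abstract does not name a specific scheme, and the `partition` function, the record layout, and the use of CRC32 as the hash are all assumptions.

```python
import zlib


def partition(records, num_threads, key):
    """Split change records into one bucket per apply thread.

    key: hypothetical callable that extracts the partitioning key from a record.
    Records with equal keys always land in the same bucket, so changes to the
    same row stay ordered within one apply thread.
    """
    buckets = [[] for _ in range(num_threads)]
    for record in records:
        # CRC32 is used because it is stable across runs, unlike built-in hash().
        digest = zlib.crc32(str(key(record)).encode("utf-8"))
        buckets[digest % num_threads].append(record)
    return buckets
```

Partitioning by row key rather than by transaction is what lets the changes of a single source transaction be spread over several apply threads.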
Abstract:
Techniques are provided for data definition language (DDL) expression annotation. DDL expression text is captured. The DDL expression text corresponds to a DDL change in a source database. A component set is determined. The component set includes at least one component in the DDL expression text. An annotation set is generated. The annotation set includes at least one annotation for at least one component of the component set. Each annotation includes hierarchical data describing at least one hierarchical relationship in the component set. For example, an annotation may include a component ID, a component position, a component length, a component type, and a parent component ID. The annotation set and a change record comprising the DDL expression text are transmitted to a replication client.
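The annotation fields listed in the example map naturally onto a small record type. The sketch below is a hypothetical rendering: the `Annotation` class, the helper functions, the component type labels, and the sample DDL statement are assumptions, and positions are computed with a plain substring search rather than a real DDL parser.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    component_id: int
    position: int              # character offset into the DDL expression text
    length: int
    component_type: str        # hypothetical labels such as "TABLE" or "COLUMN"
    parent_id: Optional[int]   # None for the root component


def annotate(ddl, component_id, text, component_type, parent_id):
    """Build an annotation for the first occurrence of text in ddl."""
    return Annotation(component_id, ddl.index(text), len(text),
                      component_type, parent_id)


def component_text(ddl, ann):
    """Recover a component's text from its position and length."""
    return ddl[ann.position:ann.position + ann.length]


def children_of(annotations, parent_id):
    """Return the annotations whose parent is the given component."""
    return [a for a in annotations if a.parent_id == parent_id]


ddl = "ALTER TABLE hr.emp ADD (bonus NUMBER)"
annotations = [
    annotate(ddl, 1, ddl, "ALTER_TABLE", None),
    annotate(ddl, 2, "hr.emp", "TABLE", 1),
    annotate(ddl, 3, "hr", "SCHEMA", 2),
    annotate(ddl, 4, "bonus", "COLUMN", 1),
]
```

With annotations like these, a replication client could locate the schema name by offset and rewrite it without parsing the DDL text itself.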
Abstract:
A lead-sync log record is used to synchronize the replication logs of follower shards to the leader shard. In response to a failure to determine that there is a consensus for a database transaction commit operation after a shard server becomes a new leader, the new leader shard performs a sync operation using the lead-sync log record to synchronize replication logs of the follower shards to the replication log of the new leader. A shard server identifies a first transaction having a first log record but not a post-commit log record in the replication log, defines a recovery window in the replication log starting at the first log record of the identified first transaction and ending at the lead-sync log record, identifies a set of transactions to be recovered, and performs a recovery action on the set of transactions to be recovered.
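The recovery window can be sketched over a toy replication log. This is an illustration under stated assumptions, not the described implementation: the log is modeled as a list of `(txn_id, kind)` tuples, the kind labels `"first"`, `"post_commit"`, and `"lead_sync"` are hypothetical, and the set of transactions to recover is assumed to be every transaction in the window that lacks a post-commit record.

```python
def recovery_window(log):
    """Compute (start, end, pending) for a log of (txn_id, kind) tuples.

    Returns None when there is no lead-sync record or nothing to recover.
    """
    # The window ends at the most recent lead-sync record.
    end = None
    for i, (_, kind) in enumerate(log):
        if kind == "lead_sync":
            end = i
    if end is None:
        return None

    # Transactions with a post-commit record before the lead-sync are finished.
    finished = {txn for txn, kind in log[:end] if kind == "post_commit"}

    # The window starts at the first record of the earliest unfinished transaction.
    start = next(
        (i for i, (txn, kind) in enumerate(log[:end])
         if kind == "first" and txn not in finished),
        None,
    )
    if start is None:
        return None

    # Collect unfinished transactions inside the window, in log order.
    pending = []
    for txn, _ in log[start:end]:
        if txn is not None and txn not in finished and txn not in pending:
            pending.append(txn)
    return start, end, pending
```

The recovery action itself (for example, committing or rolling back each pending transaction) is left out because the abstract does not specify it.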
Abstract:
A consensus protocol-based replication approach is provided. For each change operation performed by a leader server on a copy of the database, the leader server creates a replication log record and returns a result to the client. The leader does not wait for consensus for the change operation from the followers. For a commit, the leader creates a commit log record and waits for consensus. Thus, the leader executes database transactions asynchronously, performs replication of change operations asynchronously, and performs replication of transaction commits synchronously.
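The split between asynchronous change operations and synchronous commits can be shown with a single-process toy model. Everything here is an assumption made for illustration: the `Leader` and `Follower` classes, the majority rule for consensus, and the simplification that unsent log records are shipped when the commit is requested rather than streamed in the background.

```python
class Follower:
    def __init__(self, reachable=True):
        self.reachable = reachable
        self.log = []

    def replicate(self, records):
        """Append any records not yet received; return True as an acknowledgement."""
        if not self.reachable:
            return False
        self.log.extend(records[len(self.log):])
        return True


class Leader:
    def __init__(self, followers):
        self.followers = followers
        self.log = []

    def change(self, txn, op):
        """Log the change and answer the client without waiting for followers."""
        self.log.append((txn, "change", op))
        return "ok"

    def commit(self, txn):
        """Log the commit, then wait for a majority of the cluster to acknowledge."""
        self.log.append((txn, "commit", None))
        acks = 1 + sum(f.replicate(self.log) for f in self.followers)  # leader counts itself
        cluster_size = 1 + len(self.followers)
        return acks > cluster_size // 2
```

The client sees `"ok"` for each change immediately, while `commit` reports success only once more than half of the servers hold the commit record.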
Abstract:
Techniques are described for automatically detecting and accommodating state changes in a computer-generated forecast. In one or more embodiments, a representation of a time-series signal is generated within volatile and/or non-volatile storage of a computing device. The representation may be generated in such a way as to approximate the behavior of the time-series signal across one or more seasonal periods. Once generated, a set of one or more state changes within the representation of the time-series signal is identified. Based at least in part on at least one state change in the set of one or more state changes, a subset of values from the sequence of values is selected to train a model. An analytical output is then generated, within volatile and/or non-volatile storage of the computing device, using the trained model.
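The idea of training only on values that follow a state change can be illustrated with a deliberately naive stand-in. In this sketch a state change is assumed to be any jump between consecutive values larger than a threshold, and the "model" is just the mean of the selected subset. Both choices, and the function names, are hypothetical; the abstract identifies state changes within a seasonal representation of the signal, which this sketch skips.

```python
def last_state_change(values, threshold):
    """Index of the last jump larger than threshold, or 0 if there is none."""
    index = 0
    for i in range(1, len(values)):
        if abs(values[i] - values[i - 1]) > threshold:
            index = i
    return index


def train_mean_model(values, threshold):
    """Fit a mean-only model on the values that follow the last state change."""
    subset = values[last_state_change(values, threshold):]
    return sum(subset) / len(subset)
```

Dropping the values before the jump keeps the older level from pulling the forecast toward a state the signal has already left.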