Abstract:
An improved system and method are provided for applying a transaction once when it is delivered in a message published asynchronously in a distributed database. In various embodiments, apply-once messaging may be achieved for asynchronous publication by keeping a persistent log on a messaging server. A messaging server may receive an update message for a transaction to be published asynchronously in a distributed database, may generate a sequence number for the transaction in the message, and may log the update message with the sequence number in a log file persistently stored on the messaging server. The messaging server may then send an acknowledgement that the update message is published and may asynchronously publish the update message with the sequence number to subscribers. The publication may succeed only if no message tagged with the same sequence number has previously been published by the messaging server.
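The log-then-acknowledge flow described in this abstract can be illustrated with a minimal sketch. Everything named here (`MessagingServer`, `receive`, `publish`, the JSON-lines log format, the in-memory `published` set) is a hypothetical stand-in chosen for illustration, not the claimed implementation; in particular, where the duplicate check is performed is an assumption.

```python
import json
import os
import tempfile


class MessagingServer:
    """Hypothetical broker: logs each update durably, then publishes it once."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.next_seq = 1          # next sequence number to hand out
        self.published = set()     # sequence numbers already published
        self.subscribers = []      # callables that apply a record

    def receive(self, update):
        """Assign a sequence number and persist the update before acknowledging."""
        record = {"seq": self.next_seq, "update": update}
        self.next_seq += 1
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(record) + "\n")
            log.flush()
            os.fsync(log.fileno())  # force the log entry to stable storage
        return record               # returning stands in for the acknowledgement

    def publish(self, record):
        """Publish only if this sequence number has not been published before."""
        if record["seq"] in self.published:
            return False
        for apply in self.subscribers:
            apply(record)
        self.published.add(record["seq"])
        return True


def demo():
    state = {}
    with tempfile.TemporaryDirectory() as tmp:
        server = MessagingServer(os.path.join(tmp, "updates.log"))
        server.subscribers.append(lambda rec: state.update(rec["update"]))
        record = server.receive({"balance": 10})
        first = server.publish(record)
        second = server.publish(record)  # retry of the same message
        with open(server.log_path, encoding="utf-8") as log:
            logged = [json.loads(line) for line in log]
    return first, second, state, logged
```

In this sketch a retried publication of the same record is rejected, so the subscriber's state reflects the update a single time.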
Abstract:
The subject matter disclosed herein relates to bulk loading of data into a database comprising a plurality of database partitions. In one particular example, the database partitioning may be revised before addition of the new data to the partitions.
Abstract:
An improved system and method for loading records into a partitioned database table is provided. A translation of records may be generated from a set of source partitions to a set of target partitions by generating a bipartite graph, determining a maximal matching using dynamic programming for a chain of nodes remaining in the bipartite graph after removing singleton edges, and generating a maximal matching after adding back the singleton edges for translation of records from the set of source partitions to the set of target partitions. The partition translation may be executed by traversing from top to bottom the set of source partitions and the set of target partitions in record key order to generate an optimal sequence of operations to transfer the records from the set of source partitions to the set of target partitions.
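The dynamic-programming step over a chain of nodes can be sketched as follows. This is an illustrative simplification, not the patented procedure: it assumes the chain is a simple path whose edges carry hypothetical weights (for example, the number of records a source and target partition share), and it computes a maximum-weight matching on that path. The function name and the weighting are assumptions.

```python
def max_weight_chain_matching(weights):
    """Maximum-weight matching on a path graph.

    weights[k] is the (assumed) weight of the edge joining chain node k and
    node k + 1. Two adjacent edges share a node, so they cannot both be chosen.
    Returns (total_weight, sorted list of chosen edge indices).
    """
    n = len(weights)
    # best[k] = best total using only the first k edges of the chain
    best = [0] * (n + 1)
    for k in range(1, n + 1):
        skip = best[k - 1]
        take = weights[k - 1] + (best[k - 2] if k >= 2 else 0)
        best[k] = max(skip, take)

    # Walk backwards to recover which edges were taken.
    chosen = []
    k = n
    while k >= 1:
        take = weights[k - 1] + (best[k - 2] if k >= 2 else 0)
        if take > best[k - 1]:
            chosen.append(k - 1)
            k -= 2          # the neighbouring edge is now excluded
        else:
            k -= 1
    chosen.reverse()
    return best[n], chosen
```

For example, `max_weight_chain_matching([5, 9, 4, 7])` selects the second and fourth edges of the chain for a total weight of 16.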
Abstract:
In a large-scale transaction such as the bulk loading of new records into an ordered, distributed database, a transaction limit such as an insert limit may be chosen, partitions on overfull storage servers may be designated to be moved to underfull storage servers, and the move assignments may be based, at least in part, on the degree to which a storage server is underfull and on the move and insertion costs of the partitions to be moved.
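One way to picture the move assignment is a greedy pass, sketched below. The abstract does not specify an algorithm, so the greedy ordering, the cost-per-insert ratio, and the names `assign_moves`, `server_load`, and `limit` are all assumptions made for illustration.

```python
def assign_moves(server_load, partitions, limit):
    """Greedy sketch: relieve overfull servers by moving cheap partitions.

    server_load: dict mapping server name -> expected inserts on that server
    partitions:  list of (partition_id, server, inserts, move_cost) tuples
    limit:       insert limit per server (the chosen transaction limit)
    Returns (moves, resulting_load), where moves is a list of
    (partition_id, source, destination) tuples.
    """
    load = dict(server_load)
    moves = []
    # Try partitions with the lowest move cost per relieved insert first.
    for pid, src, inserts, cost in sorted(
        partitions, key=lambda p: p[3] / max(p[2], 1)
    ):
        if load[src] <= limit:
            continue  # source is not overfull (any more)
        # Destinations that remain within the limit after receiving the partition.
        candidates = [s for s in load if s != src and load[s] + inserts <= limit]
        if not candidates:
            continue
        dst = min(candidates, key=lambda s: load[s])  # most underfull server
        load[src] -= inserts
        load[dst] += inserts
        moves.append((pid, src, dst))
    return moves, load
```

With loads `{"A": 120, "B": 30, "C": 60}` and a limit of 80, a 40-insert partition on A with low move cost is sent to B, which is the most underfull server that can absorb it.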
Abstract:
Methods and systems for providing social feeds from a plurality of third party sites to a user at a host site include retrieving one or more access logs capturing online behavior of the user. The access logs are analyzed to determine the user's interactive behavioral pattern related to social feeds from each of the plurality of third party sites. A refresh schedule for the user is computed to refresh cache entries of social feeds at the host site based on the analysis of the user's online behavior at the social feeds. Cache entries of social feeds for the user are refreshed at the host site from one or more of the plurality of third party sites at an allotted time specified by the refresh schedule.
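A refresh schedule derived from access logs might look like the sketch below. The log format (pairs of site and hour of day), the "refresh shortly before the user's peak hour" rule, and the names `refresh_schedule` and `lead_hours` are hypothetical; the abstract does not say how the schedule is computed.

```python
from collections import Counter, defaultdict


def refresh_schedule(access_log, lead_hours=1):
    """Pick, per site, an hour of day at which to refresh the cached feed.

    access_log: iterable of (site, hour_of_day) pairs, hour in 0..23
    lead_hours: how long before the user's busiest hour to refresh (assumed)
    Returns a dict mapping site -> refresh hour (0..23).
    """
    by_site = defaultdict(Counter)
    for site, hour in access_log:
        by_site[site][hour] += 1

    schedule = {}
    for site, hours in by_site.items():
        # Busiest hour for this site; the earliest hour wins ties.
        peak = max(sorted(hours), key=lambda h: hours[h])
        schedule[site] = (peak - lead_hours) % 24
    return schedule
```

A user who mostly reads one site's feed at 09:00 would, under these assumptions, have that site's cache entry refreshed at 08:00.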
Abstract:
Methods, systems, and programs for balancing workload in a distributed system are provided. A plurality of multi-dimensional load metrics are received from a plurality of resource units in the distributed system. Based on the received plurality of multi-dimensional load metrics and a global statistical load model, a load deviance for each resource unit is computed. The plurality of resource units in the distributed system are then ranked based on the load deviance of each resource unit. At least one load balancing action is further determined based on the ranked resource units and at least one load balancing policy.
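The deviance-and-rank step can be sketched under simple assumptions. Here the "global statistical load model" is taken to be the per-dimension mean and population standard deviation across all units, and the load deviance is the sum of a unit's z-scores; both choices, and the name `rank_by_load_deviance`, are illustrative assumptions rather than the method the abstract claims.

```python
from statistics import mean, pstdev


def rank_by_load_deviance(metrics):
    """Rank resource units from most overloaded to most underloaded.

    metrics: dict mapping unit name -> dict of dimension -> value.
             Every unit is assumed to report every dimension.
    Returns a list of (unit, deviance) sorted by deviance, highest first.
    """
    dims = sorted({d for m in metrics.values() for d in m})

    # Assumed global model: mean and standard deviation per dimension.
    model = {}
    for d in dims:
        values = [m[d] for m in metrics.values()]
        model[d] = (mean(values), pstdev(values))

    deviance = {}
    for unit, m in metrics.items():
        total = 0.0
        for d in dims:
            mu, sigma = model[d]
            if sigma > 0:  # a dimension with no spread contributes nothing
                total += (m[d] - mu) / sigma
        deviance[unit] = total

    return sorted(deviance.items(), key=lambda kv: kv[1], reverse=True)
```

A load balancing policy could then, for instance, move work from the first-ranked unit to the last-ranked one.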
Abstract:
In a distributed system that includes multiple machines, a scheduler attempts to schedule a task on a machine that is not currently overloaded with work. If a task is scheduled on a machine that does not yet have copies of the portions of the data set on which the task needs to operate, then that machine obtains copies of those portions from other machines that already have them. Whenever a “source” machine ships a copy of a portion to another “destination” machine in the distributed system, the destination machine persistently stores that copy on the destination machine's persistent storage mechanism. The copy also remains on the source machine. Thus, portions of the data set are automatically replicated whenever those portions are shipped between machines of the distributed system. Each machine in the distributed system has access to “global” information that indicates which machines have which portions of the data set.
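The replicate-on-ship behavior can be illustrated with a small in-memory model. The `Cluster` class, its dictionaries standing in for persistent storage, the task counter used as a load measure, and the choice of source machine are all hypothetical simplifications.

```python
class Cluster:
    """Toy model: shipping a data portion to a machine also replicates it there."""

    def __init__(self, machines):
        self.store = {m: {} for m in machines}    # stand-in for persistent storage
        self.running = {m: 0 for m in machines}   # tasks currently on each machine
        self.locations = {}                       # "global" map: portion -> machines

    def put(self, machine, portion, data):
        """Persist a portion on a machine and record it in the global map."""
        self.store[machine][portion] = data
        self.locations.setdefault(portion, set()).add(machine)

    def schedule(self, portions):
        """Run a task on the least-loaded machine, fetching missing portions."""
        target = min(self.running, key=lambda m: (self.running[m], m))
        for p in portions:
            if p not in self.store[target]:
                source = sorted(self.locations[p])[0]  # any machine holding p
                # The copy is persisted on the target and also stays on the source.
                self.put(target, p, self.store[source][p])
        self.running[target] += 1
        return target
```

After a task needing `part-0` is scheduled on a machine that lacked it, the global map lists both the original holder and the new machine for that portion.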