Abstract:
An improved system and method for writing data dependent upon multiple reads in a distributed database is provided. A client may read several data records and may then send a request to a database server to perform a transaction to write a data record dependent upon multiple data records read. A database server may receive the request specifying a transaction to write a data record dependent upon multiple data records read and may perform the transaction by latching a master data record to be written and validating the data records the write depends upon. The multiple data records upon which the write depends may be validated by verifying the multiple data records are current versions of the data records stored in the distributed database. Data intensive applications may use this transaction type in large scale distributed database systems to provide stronger consistency without significantly degrading performance and scalability.
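The validate-then-write step described above can be illustrated with a minimal single-process sketch. It assumes each record carries a version number and that "current" means the version the client read still matches the stored one; the names `Record`, `Store`, and `write_if_reads_current` are hypothetical and do not come from the abstract.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Record:
    value: object
    version: int = 0
    # Per-record latch; a stand-in for latching the master copy of a record.
    latch: threading.Lock = field(default_factory=threading.Lock)


class Store:
    """Toy in-memory stand-in for the distributed database (hypothetical)."""

    def __init__(self):
        self.records = {}

    def put(self, key, value):
        self.records[key] = Record(value)

    def read(self, key):
        rec = self.records[key]
        return rec.value, rec.version

    def write_if_reads_current(self, key, new_value, read_versions):
        """Write `key` only if every record in `read_versions` is unchanged.

        `read_versions` maps each key the write depends on to the version
        the client observed when it read that key.
        """
        target = self.records.setdefault(key, Record(None))
        with target.latch:  # latch the record to be written
            for dep_key, seen_version in read_versions.items():
                if self.records[dep_key].version != seen_version:
                    return False  # a dependency changed since it was read
            target.value = new_value
            target.version += 1
            return True
```

A client would read its inputs with `read`, keep the returned versions, and pass them to `write_if_reads_current`; a `False` result signals that at least one input is stale and the client should re-read and retry.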
Abstract:
Computer-implemented methods, modules, and clients relate to an expanded, pruned sample table for testing database queries against a base table. The expanded, pruned sample table is formed from the base table by a process of initial sampling, synthesis, and pruning.
Abstract:
A system and method for generating advertisements based on search intent are provided. The system includes a query engine and an advertisement engine. The query engine receives a query from a user and analyzes it to determine a query intent that is matched to a predetermined domain. A translated query is generated that includes the domain type. Once a domain is selected, the query may be further analyzed to determine generic domain information. The domain and associated information may then be matched to a list of advertisements. Each advertisement may be assigned an ad match score based on a correlation between the query information and various listing information provided in the advertisement.
Abstract:
A system and apparatus use block-level sampling for histogram construction as well as distinct-value estimation. For histogram construction, the system implements a two-phase adaptive method in which the sample size required to reach a desired accuracy is decided based on a first-phase sample. This method is significantly faster than previous iterative block-level sampling methods proposed for the same problem. For distinct-value estimation, it is shown that existing estimators designed for uniform-random samples may perform very poorly with block-level samples. An exemplary system computes an appropriate subset of a block-level sample that is suitable for use with most existing estimators.
Abstract:
A method, an apparatus, a computer-program product, and a system for determining bandwidth for transmission of data packets are disclosed. A data packet in a plurality of data packets is received. An amount of bandwidth required for transmission of the received data packet is determined. The amount of bandwidth is a portion of a total available bandwidth for a radio link. At least one condition associated with the radio link for transmitting the received data packet to a user device is determined. Based on the determined amount of bandwidth and the determined condition, the received data packet is transmitted to the user device. Another data packet in the plurality of data packets is transmitted using another portion of the total available bandwidth.
Abstract:
A real-time messaging platform and method are disclosed that classify messages in accordance with a combination of user engagement events, modified to reflect the temporal structure of those events. A message can be assigned a metric based, for example, on a weighted combination of user engagement rates, decayed with time to reflect an intuition that recent interactions by one or more users with the message will have a greater impact than older interactions with the message. Different types of interaction by one or more users with the message can be assigned different weights when the different engagement events are combined and can also be assigned different temporal characteristics.
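One possible reading of the time-decayed, weighted metric is sketched below. The abstract does not specify a decay function, weights, or event types, so the exponential half-life decay, the `WEIGHTS` and `HALF_LIVES` tables, and the function name `engagement_score` are all assumptions made for illustration.

```python
# Hypothetical per-event weights and half-lives in seconds.
# The abstract names neither the event types nor any values.
WEIGHTS = {"reply": 3.0, "share": 2.0, "like": 1.0}
HALF_LIVES = {"reply": 3600.0, "share": 1800.0, "like": 600.0}


def engagement_score(events, now):
    """Return a decayed, weighted sum over engagement events.

    `events` is an iterable of (event_type, timestamp) pairs, with
    timestamps and `now` in seconds. Each event contributes its weight
    multiplied by 0.5 ** (age / half_life), so an event loses half of
    its influence every half-life.
    """
    score = 0.0
    for event_type, timestamp in events:
        age = max(0.0, now - timestamp)
        decay = 0.5 ** (age / HALF_LIVES[event_type])
        score += WEIGHTS[event_type] * decay
    return score
```

Giving each event type its own half-life is one way to model the "different temporal characteristics" mentioned above: here a reply stays influential longer than a like.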
Abstract:
Techniques that support trail-based exploration by a user of a repository of documents are described herein. In one embodiment, trail definition data that specifies a trail is received. The trail includes an ordered series of waypoints including a trailhead, intermediate waypoints, and one or more trailends. In some embodiments, deadends may also be defined in the trail. A particular waypoint in the ordered series of waypoints is established as a current waypoint. Search terms can be received from a user to cause a search to be performed. It is then determined whether the search satisfies matching criteria associated with the waypoint that immediately follows the current waypoint in the ordered series of waypoints. If so, the user advances to the next waypoint. Otherwise, the user remains at the current waypoint. Finally, if a trailend is reached, then an action such as rewarding the user in some way may be performed.
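The advance-or-stay rule can be sketched as a small state machine. The matching criterion used here (every required term must appear among the search terms) is an assumption, as are the names `Waypoint`, `Trail`, and `required_terms`; deadends are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Waypoint:
    name: str
    # Hypothetical matching criterion: all of these terms must be searched.
    required_terms: frozenset
    is_trailend: bool = False


class Trail:
    def __init__(self, waypoints):
        self.waypoints = waypoints  # ordered; index 0 is the trailhead
        self.current = 0

    def search(self, terms):
        """Advance if the search matches the next waypoint, else stay put.

        Returns the name of the waypoint the user is at after the search.
        """
        nxt = self.current + 1
        if nxt < len(self.waypoints):
            if self.waypoints[nxt].required_terms <= set(terms):
                self.current = nxt
        return self.waypoints[self.current].name

    def at_trailend(self):
        """True once the user has reached a trailend (e.g. to trigger a reward)."""
        return self.waypoints[self.current].is_trailend
```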
Abstract:
A technique is described that reduces the complexity and resource consumption associated with performing record expiry in a distributed database system. In accordance with the technique, a record is checked to see if it has expired only when it has been accessed for a read or a write. If at the time of a read a record is determined to have expired, then it is not served. If at the time of a write a record is determined to have expired, then the write is treated as an insertion of a new record, and steps are taken to treat the insertion consistently with regard to the previous expired version. A background process is used to delete records that have not been written to or actively deleted by a client after expiration.
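A minimal sketch of expiry-on-access follows, assuming a per-record time-to-live and an injectable clock. The class `ExpiringStore`, its method names, and the "insert"/"update" return values are hypothetical; the abstract does not say how the insertion is reconciled with the expired version, so that step is only marked by the return value here.

```python
import time


class ExpiringStore:
    def __init__(self, clock=time.time):
        self._data = {}  # key -> (value, expires_at)
        self._clock = clock

    def _expired(self, entry):
        return entry[1] <= self._clock()

    def read(self, key):
        """Check expiry only when the record is accessed; never serve expired data."""
        entry = self._data.get(key)
        if entry is None or self._expired(entry):
            return None
        return entry[0]

    def write(self, key, value, ttl):
        """Return "insert" if the key was absent or expired, else "update"."""
        entry = self._data.get(key)
        kind = "insert" if entry is None or self._expired(entry) else "update"
        self._data[key] = (value, self._clock() + ttl)
        return kind

    def sweep(self):
        """Background cleanup of expired records nobody touched; returns count removed."""
        dead = [k for k, e in self._data.items() if self._expired(e)]
        for k in dead:
            del self._data[k]
        return len(dead)
```

Because reads and writes check expiry themselves, `sweep` only reclaims space and can run infrequently without affecting what clients observe.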
Abstract:
A system is described for providing scalable in-memory caching for a distributed database. The system may include a cache, an interface, a non-volatile memory, and a processor. The cache may store a cached copy of data items stored in the non-volatile memory. The interface may communicate with devices and a replication server. The non-volatile memory may store the data items. The processor may receive an update to a data item from a device to be applied to the non-volatile memory. The processor may apply the update to the cache. The processor may generate an acknowledgment indicating that the update was applied to the non-volatile memory and may communicate the acknowledgment to the device. The processor may then communicate the update to a replication server. The processor may apply the update to the non-volatile memory upon receiving an indication that the update was stored by the replication server.
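The ordering of the steps (cache first, acknowledge, replicate, then persist) can be sketched as follows. Everything here is a simplified single-process assumption: `ReplicationServer` is a hypothetical stand-in that always succeeds, a dictionary plays the role of non-volatile memory, and the deferred work is drained by an explicit `flush` call rather than a background thread.

```python
from collections import deque


class ReplicationServer:
    """Hypothetical stand-in that records updates and reports success."""

    def __init__(self):
        self.log = []

    def store(self, key, value):
        self.log.append((key, value))
        return True


class CachingNode:
    def __init__(self, replication_server):
        self.cache = {}
        self.disk = {}  # stands in for non-volatile memory
        self.pending = deque()  # updates not yet confirmed by replication
        self.replication = replication_server

    def update(self, key, value):
        """Apply to the cache and acknowledge; persistence is deferred."""
        self.cache[key] = value
        self.pending.append((key, value))
        return {"key": key, "status": "acknowledged"}

    def flush(self):
        """Replicate pending updates in order, persisting each once it is stored."""
        applied = 0
        while self.pending:
            key, value = self.pending[0]
            if not self.replication.store(key, value):
                break  # retry later; keep ordering intact
            self.disk[key] = value
            self.pending.popleft()
            applied += 1
        return applied
```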
Abstract:
An improved system and method for asynchronous update of indexes in a distributed database is provided. A database server may receive a request to update data and may update the data in a primary data table of the distributed database. An asynchronous index update may be initiated at the time a record is updated in a data table, and control may then be returned to the client to perform another data update. An activity cache may be provided for caching the records updated by a client so that when the client requests a subsequent read, the updated records may be available in the activity cache to support the various guarantees for reading the data. Advantageously, the asynchronous index update scheme may provide increased performance and scalability while efficiently maintaining indexes over database tables in a large-scale, replicated, distributed database.
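The interaction between the deferred index update and the activity cache can be sketched in a single process. The secondary index on a `"city"` field, the class `IndexedTable`, and the per-client dictionary used as the activity cache are all assumptions for illustration; a real system would apply queued index updates in the background rather than through an explicit call.

```python
from collections import deque


class IndexedTable:
    def __init__(self):
        self.primary = {}  # key -> record (a dict with a "city" field)
        self.index = {}  # city -> set of keys (hypothetical secondary index)
        self.queue = deque()  # index updates waiting to be applied
        self.activity_cache = {}  # client_id -> {key: record}

    def update(self, client_id, key, record):
        """Update the primary table, queue the index update, and return at once."""
        old = self.primary.get(key)
        self.primary[key] = record
        self.queue.append((key, old, record))
        self.activity_cache.setdefault(client_id, {})[key] = record

    def apply_index_updates(self):
        """Drain the queue; stands in for the asynchronous index maintainer."""
        while self.queue:
            key, old, new = self.queue.popleft()
            if old is not None:
                self.index.get(old["city"], set()).discard(key)
            self.index.setdefault(new["city"], set()).add(key)

    def lookup_by_city(self, client_id, city):
        """Index lookup corrected by the client's own recent writes."""
        keys = set(self.index.get(city, set()))
        for key, rec in self.activity_cache.get(client_id, {}).items():
            if rec["city"] == city:
                keys.add(key)
            else:
                keys.discard(key)
        return keys
```

In this sketch the writing client sees its own update through `lookup_by_city` immediately, while other clients see it only after `apply_index_updates` has run.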