Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for restarting a query using a token. One of the methods includes receiving, by a computer from a requesting device, a query; determining, using a data storage system, a current result responsive to the query; generating, using the current result, a restart token that represents operations performed to determine a plurality of results responsive to the query including the current result responsive to the query and that can be used to determine a new result responsive to the query that was not included in the plurality of results responsive to the query; and providing, to the requesting device, a message that includes a) first data for the restart token that represents operations performed to determine the plurality of results responsive to the query and b) second data for the current result responsive to the query.
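As an illustration only, the exchange described above can be sketched as follows. The abstract does not say how a restart token is encoded or what the data storage system looks like, so everything here is an assumption: the in-memory `ROWS` table, the prefix-style query, the base64/JSON encoding, and the names `encode_token`, `decode_token`, and `next_result` are all hypothetical.

```python
import base64
import json

# Hypothetical in-memory stand-in for the data storage system: rows sorted by key.
ROWS = [("a1", "alpha"), ("a2", "beta"), ("a3", "gamma"), ("b1", "delta")]


def encode_token(state):
    """Serialize the query progress into an opaque string (assumed encoding)."""
    return base64.urlsafe_b64encode(json.dumps(state).encode()).decode()


def decode_token(token):
    """Recover the query progress from a token produced by encode_token."""
    return json.loads(base64.urlsafe_b64decode(token.encode()).decode())


def next_result(prefix, token=None):
    """Return a message holding one result and a restart token, or None when
    no further result matches the query."""
    last_key = decode_token(token)["last_key"] if token else None
    for key, value in ROWS:
        if not key.startswith(prefix):
            continue  # row does not match the query
        if last_key is not None and key <= last_key:
            continue  # row was already returned before the restart
        return {
            # first data: token representing the work done so far
            "restart_token": encode_token({"prefix": prefix, "last_key": key}),
            # second data: the current result
            "result": (key, value),
        }
    return None
```

In this sketch a requesting device that loses its connection can resend the query with the last token it received and obtain only results it has not yet seen.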
Abstract:
A multiversioned position-space indexing system is disclosed. The system includes data structures for maintaining a multiversioned position space: a multiversioned filter merge list, which represents many versions of a changing position space in a very compact form, and a position shift map, which describes how to translate stored positions in many different log-structured merge tree layers into logical positions at a particular timestamp. Each log-structured merge tree layer can be divided into two sublayers: a final sublayer and a correction sublayer. The final sublayer contains index entries that were added after the layer's start timestamp and remain live as of the layer's final timestamp, as well as deletion markers for index entries that were inserted before the layer's start timestamp but deleted before the layer's final timestamp. The correction sublayer contains index entries that were both created and deleted between the start and final timestamps of the layer.
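The split of a layer into two sublayers can be sketched as a small classification function. This is a minimal sketch under stated assumptions, not the disclosed implementation: the `Entry` record, the `classify` function, the integer timestamps, and the choice to treat a layer as the half-open interval (start, final] are all hypothetical, since the abstract does not specify how boundary timestamps are handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical index entry with its insertion and deletion timestamps."""
    key: str
    insert_ts: int
    delete_ts: Optional[int] = None  # None means the entry is still live


def classify(entry, start_ts, final_ts):
    """Decide where an entry belongs for the layer covering (start_ts, final_ts].

    Returns "final:entry", "final:deletion_marker", "correction", or None when
    the layer stores nothing for this entry.
    """
    inserted_in_layer = start_ts < entry.insert_ts <= final_ts
    deleted_in_layer = (
        entry.delete_ts is not None and start_ts < entry.delete_ts <= final_ts
    )
    if inserted_in_layer and deleted_in_layer:
        # Created and deleted within the layer: correction sublayer.
        return "correction"
    if inserted_in_layer:
        # Added within the layer and still live at the final timestamp.
        return "final:entry"
    if entry.insert_ts <= start_ts and deleted_in_layer:
        # Older entry deleted within the layer: final sublayer keeps a marker.
        return "final:deletion_marker"
    return None
```

Under these assumptions, a reader at the layer's final timestamp needs only the final sublayer, while a reader at an intermediate timestamp would also have to consult the correction sublayer.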