Abstract:
Systems and methods are provided for efficient calculation of sets of distinct results in an information retrieval service. A query is received having at least one requested attribute and one or more conditions. For each row identifier in a database table that matches the one or more conditions, a tuple of value identifiers having an entry for each requested attribute is calculated. A unique number is generated and assigned to the tuple for each distinct combination of the value identifiers. Duplicate entries in the tuple listing are identified and removed, so that a result set provides only distinct results.
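The distinct-result calculation described above can be sketched as follows. This is a minimal illustration, not the claimed method: it assumes dictionary-encoded columns (each attribute stores one value identifier per row) and assigns the unique number sequentially through a hash map. The names `distinct_rows`, `columns`, and `dictionaries` are hypothetical.

```python
def distinct_rows(columns, dictionaries, requested, matching_rows):
    """Return distinct decoded result rows for the requested attributes.

    columns:       attribute -> list of value IDs, one per row (assumed layout)
    dictionaries:  attribute -> list mapping value ID -> actual value
    requested:     attributes named in the query
    matching_rows: row identifiers that already satisfy the query conditions
    """
    tuple_numbers = {}  # value-ID tuple -> unique number (assigned on first sight)
    result = []
    for row in matching_rows:
        # Build the tuple of value IDs, one entry per requested attribute.
        key = tuple(columns[attr][row] for attr in requested)
        if key in tuple_numbers:
            continue  # duplicate combination: drop it
        tuple_numbers[key] = len(tuple_numbers)
        # Decode value IDs only for tuples that survive deduplication.
        result.append(tuple(dictionaries[a][v] for a, v in zip(requested, key)))
    return result


dictionaries = {"city": ["Berlin", "Paris"], "color": ["blue", "red"]}
columns = {"city": [0, 1, 0, 0, 1], "color": [1, 0, 1, 0, 0]}
rows = distinct_rows(columns, dictionaries, ["city", "color"], [0, 1, 2, 4])
```

Working on value identifiers rather than decoded values keeps the comparison cheap, since each tuple is a short sequence of integers.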
Abstract:
A join operation between split data tables includes providing reduction data from first partitions to each partition among second partitions. The reduction data serves to identify actual values in one of the second partitions that also occur in one of the first partitions. Global IDs are assigned. Translation lists including the global IDs are sent to the first partitions. Each first partition and each second partition create globalized lists which can then be combined to generate respective first and second compiled lists. The join operation can then be conducted on the first and second compiled lists.
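The flow of this partitioned join can be sketched in a single process. The sketch below is an assumption-laden simplification: partitions are plain lists of `(row_id, join_value)` pairs, the reduction data is simply the set of distinct join values, and the function name `partitioned_join` is hypothetical. In a real deployment each partition would reside on a separate host and exchange these lists over the network.

```python
def partitioned_join(first_parts, second_parts):
    """Join two partitioned tables on a value column using global IDs."""
    # Reduction data: the distinct join values present in the first partitions.
    reduction = [{value for _, value in part} for part in first_parts]
    first_values = set().union(*reduction)

    # Each second partition identifies its values that also occur in a first partition.
    shared = set()
    for part in second_parts:
        shared |= {value for _, value in part if value in first_values}

    # Assign one global ID per shared value; this mapping plays the role of
    # the translation list sent back to the first partitions.
    global_id = {value: gid for gid, value in enumerate(sorted(shared))}

    # Globalized lists from every partition, combined into compiled lists.
    first_compiled = [(global_id[v], r) for part in first_parts
                      for r, v in part if v in global_id]
    second_compiled = [(global_id[v], r) for part in second_parts
                       for r, v in part if v in global_id]

    # The join itself compares only integer global IDs.
    by_gid = {}
    for gid, row in second_compiled:
        by_gid.setdefault(gid, []).append(row)
    return [(r1, r2) for gid, r1 in first_compiled for r2 in by_gid.get(gid, [])]


first_parts = [[("a0", "x"), ("a1", "y")], [("a2", "z")]]
second_parts = [[("b0", "y"), ("b1", "w")], [("b2", "x"), ("b3", "y")]]
pairs = partitioned_join(first_parts, second_parts)
```

Values that occur on only one side (`"z"` and `"w"` here) never receive a global ID, so they are dropped before the join runs.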
Abstract:
Methods and apparatus, including computer program products, for selection of rows and values from indexes with updates. In general, rows of an index may be associated with validity flags that indicate whether a row has been updated with an update inserted in a delta index; one scheme for value identifiers may be used for an index and another scheme for one or more delta indexes, where all of the indexes are, to at least some extent, compressed according to dictionary-based compression; and multiple delta indexes may be used in alternation such that one delta index may accept updates while another is being updated. The delta indexes may also have validity flags, and all updates, such as modification of values, deletion of records, and insertion of new records, may be handled as updates accepted by one or more delta indexes.
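The interaction between validity flags and a delta index can be illustrated with a small sketch. It makes several simplifying assumptions: a single delta index rather than several used in alternation, no dictionary compression, and rows stored as `(key, value)` pairs. The class name `IndexWithDelta` and its methods are hypothetical.

```python
class IndexWithDelta:
    """Main index plus one delta index, each with per-row validity flags."""

    def __init__(self, rows):
        self.main = list(rows)               # (key, value) pairs, never rewritten
        self.main_valid = [True] * len(self.main)
        self.delta = []                      # updates are appended here
        self.delta_valid = []

    def _invalidate(self, key):
        # Clear the validity flag of every stored row with this key.
        for i, (k, _) in enumerate(self.main):
            if k == key:
                self.main_valid[i] = False
        for i, (k, _) in enumerate(self.delta):
            if k == key:
                self.delta_valid[i] = False

    def upsert(self, key, value):
        # A modification or an insertion: invalidate old rows, append the new one.
        self._invalidate(key)
        self.delta.append((key, value))
        self.delta_valid.append(True)

    def delete(self, key):
        # A deletion only clears flags; nothing is physically removed.
        self._invalidate(key)

    def select(self):
        # Read valid rows from the main index, then from the delta index.
        main = [r for r, ok in zip(self.main, self.main_valid) if ok]
        delta = [r for r, ok in zip(self.delta, self.delta_valid) if ok]
        return main + delta


idx = IndexWithDelta([(1, "a"), (2, "b"), (3, "c")])
idx.upsert(2, "B")   # modify an existing record
idx.delete(3)        # delete a record
idx.upsert(4, "d")   # insert a new record
idx.upsert(4, "D")   # modify a record that lives only in the delta
selected = idx.select()
```

Because the main index is only ever flagged, never rewritten, readers can keep scanning it while updates accumulate in the delta.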
Abstract:
A search query for a collection of electronic documents is parsed to identify one or more terms and such identified terms are associated with one or more languages (i.e., spoken languages such as English, German, Spanish, etc.). A terms inverted index and a language inverted index are accessed to identify documents responsive to the query. Related apparatus, systems, techniques and articles are also described.
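One way to combine the two inverted indexes is sketched below. The abstract does not say how terms are associated with languages, so the sketch assumes a hypothetical lookup table `term_languages`; the function name `search` and the AND semantics across query terms are likewise assumptions.

```python
def search(query, term_languages, term_index, language_index):
    """Return IDs of documents matching every query term in one of its languages.

    term_languages: term -> set of language codes (hypothetical lookup table)
    term_index:     term -> set of document IDs containing the term
    language_index: language code -> set of document IDs written in that language
    """
    hits = None
    for term in query.lower().split():
        languages = term_languages.get(term, set())
        # Documents written in any language associated with this term.
        docs_in_languages = set().union(
            *(language_index.get(lang, set()) for lang in languages)
        )
        docs = term_index.get(term, set()) & docs_in_languages
        # Intersect across terms so that every term must match.
        hits = docs if hits is None else hits & docs
    return sorted(hits or [])


term_index = {"gift": {1, 2, 3}, "shop": {1, 3}}
language_index = {"en": {1, 4}, "de": {2, 3}}
term_languages = {"gift": {"en", "de"}, "shop": {"en"}}
results = search("gift shop", term_languages, term_index, language_index)
```

Here "gift" is a valid term in both English and German, while "shop" is associated only with English, so document 3 (German) is excluded even though it contains both strings.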