Publication No.: US12105690B1
Publication Date: 2024-10-01
Application No.: US17875176
Filing Date: 2022-07-27
Applicant: Databricks Inc.
Inventors: Timothy Armstrong, Arvind Sai Krishnan, Khayyam Guliyev
IPC: G06F16/00, G06F16/22, G06F16/2455
CPC classification number: G06F16/2246, G06F16/24552
Abstract: A system for multipass sort includes a communication interface and a processor. The communication interface is configured to receive from a client device a request to sort a dataset that includes a plurality of rows. The processor is configured to perform a first sort pass on the dataset in part by: extracting prefixes associated with a first schema element associated with the dataset for the plurality of rows; and sorting the extracted prefixes utilizing an integer sort algorithm based on a sort order included in the request to sort the dataset, where sorting the extracted prefixes includes utilizing NULL values to resolve a tied range that includes at least two rows of the plurality of rows having a same extracted prefix.
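The first sort pass described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not specify the prefix width, the integer sort used, or where NULLs order, so the following are all assumptions made for illustration: an 8-byte prefix (`PREFIX_BYTES`), an LSD radix sort as the integer sort, a NULL mapped to prefix 0 and ordered below every non-NULL value, and the hypothetical helper names `extract_prefix`, `radix_sort_indices`, and `first_sort_pass`.

```python
from typing import List, Optional, Sequence, Tuple

PREFIX_BYTES = 8                        # assumed prefix width, not from the patent
MASK = (1 << (8 * PREFIX_BYTES)) - 1    # all-ones key, used to invert the order


def extract_prefix(value: Optional[str]) -> int:
    """Pack the first PREFIX_BYTES bytes of a value into an unsigned integer."""
    if value is None:
        return 0  # assumption: NULL shares the smallest possible prefix
    raw = value.encode("utf-8")[:PREFIX_BYTES].ljust(PREFIX_BYTES, b"\x00")
    return int.from_bytes(raw, "big")


def radix_sort_indices(keys: Sequence[int]) -> List[int]:
    """Stable LSD radix sort over integer keys; returns row indices in key order."""
    order = list(range(len(keys)))
    for shift in range(0, 8 * PREFIX_BYTES, 8):
        buckets: List[List[int]] = [[] for _ in range(256)]
        for i in order:
            buckets[(keys[i] >> shift) & 0xFF].append(i)
        order = [i for bucket in buckets for i in bucket]
    return order


def first_sort_pass(
    values: Sequence[Optional[str]], descending: bool = False
) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Sort rows by prefix, then use NULL flags to split tied ranges.

    Returns the row order plus the [start, end) ranges that are still tied
    and would need a later pass (for example, on the bytes after the prefix).
    """
    prefixes = [extract_prefix(v) for v in values]
    # Complementing the key turns the ascending radix sort into a descending one.
    keys = [p ^ MASK for p in prefixes] if descending else prefixes
    order = radix_sort_indices(keys)

    unresolved: List[Tuple[int, int]] = []
    start, n = 0, len(order)
    while start < n:
        end = start + 1
        while end < n and prefixes[order[end]] == prefixes[order[start]]:
            end += 1
        if end - start > 1:  # tied range: at least two rows share a prefix
            run = order[start:end]
            nulls = [i for i in run if values[i] is None]
            non_nulls = [i for i in run if values[i] is not None]
            if descending:
                order[start:end] = non_nulls + nulls
                lo, hi = start, start + len(non_nulls)
            else:
                order[start:end] = nulls + non_nulls
                lo, hi = start + len(nulls), end
            # NULL rows are mutually equal; only non-NULL rows can still be tied.
            if hi - lo > 1:
                unresolved.append((lo, hi))
        start = end
    return order, unresolved
```

Calling `first_sort_pass(["banana", "", None, "apple", "bananas!!x", "bananas!!y"])` shows both cases. The empty string and the NULL share prefix 0, and the NULL flag alone orders them. The last two rows share the prefix `bananas!` and are both non-NULL, so their range is reported as unresolved and left for a subsequent pass.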