-
Publication No.: US11567839B2
Publication Date: 2023-01-31
Application No.: US17512337
Filing Date: 2021-10-27
Applicant: Microsoft Technology Licensing, LLC
Inventor: Alexander Budovski, Cristian Diaconu, Sandeep Lingam, Alejandro Hernandez Saenz, Naveen Prakash, Krystyna Ewa Reisteter, Rogerio Ramos, Huanhui Hu, Peter Byrne
Abstract: Embodiments described herein detect data corruption in a distributed data set system. For example, a system comprises node(s) for processing queries with respect to a distributed data set comprising a plurality of storage segments. A write transaction resulting from a query with respect to a particular storage segment is logged in a log record that describes a modification to the storage segment. A log service provides the log record to a data server managing a portion of the distributed data set in which the storage segment is included, which performs the write transaction with respect to the storage segment. For redundancy purposes, the data server has replica(s) that manage respective replicas of the portion of the distributed data set managed thereby. For backup purposes, snapshots of the replica(s) are periodically generated. To determine a data corruption, a snapshot of one replica is cross-validated with a snapshot of another replica.
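The cross-validation step can be sketched as follows. This is only an illustration under stated assumptions: each snapshot is modeled as a mapping from storage-segment ID to bytes, both snapshots are assumed to reflect the same log position, and comparison by SHA-256 digest is an assumption (the abstract does not say how snapshots are compared). The names `segment_digests` and `cross_validate` are hypothetical.

```python
import hashlib
from typing import Dict, List

# Hypothetical snapshot model: storage-segment ID -> segment contents.
Snapshot = Dict[int, bytes]


def segment_digests(snapshot: Snapshot) -> Dict[int, str]:
    """Return a SHA-256 digest for each storage segment in the snapshot."""
    return {seg_id: hashlib.sha256(data).hexdigest()
            for seg_id, data in snapshot.items()}


def cross_validate(a: Snapshot, b: Snapshot) -> List[int]:
    """Return sorted IDs of segments that differ or exist in only one snapshot."""
    da, db = segment_digests(a), segment_digests(b)
    return sorted(seg for seg in da.keys() | db.keys()
                  if da.get(seg) != db.get(seg))


# Two replica snapshots that disagree on segment 2.
replica_1 = {1: b"alpha", 2: b"beta", 3: b"gamma"}
replica_2 = {1: b"alpha", 2: b"bEta", 3: b"gamma"}
suspect = cross_validate(replica_1, replica_2)
```

A non-empty result only shows that the replicas disagree on those segments; deciding which replica is corrupt would need further evidence, such as replaying the log records for the segment.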
-
Publication No.: US11455292B2
Publication Date: 2022-09-27
Application No.: US16138238
Filing Date: 2018-09-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Cristian Diaconu, Naveen Prakash, Alexander Budovski, Huanhui Hu, Alejandro Hernandez Saenz
Abstract: Brokering log records so as to prevent log records that are not yet persisted in a persistent log from being disseminated. The log records may be generated as a primary compute system performs operations. Upon receiving a request for a log record, the broker component determines whether the requested log record has been persisted in a persistent log. If the broker component determines that the log record has been persisted in the persistent log, the broker component responds to the request by causing the requested log record to be provided to the requesting entity (e.g., a secondary compute system). On the other hand, if the broker component cannot yet determine that the log record has been persisted in the persistent log, the broker component prevents the log record from being provided to the requesting entity. This prevents data from being inconsistent during recovery.
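The brokering rule above can be sketched as follows. This is a minimal illustration, not the patented design: the class `LogBroker`, its method names, and the use of a monotonically increasing log sequence number (LSN) as a persistence watermark are all assumptions.

```python
from typing import Dict, Optional


class LogBroker:
    """Hypothetical broker that serves only log records known to be persisted."""

    def __init__(self) -> None:
        self._records: Dict[int, bytes] = {}  # LSN -> record payload
        self._persisted_up_to = 0             # highest LSN confirmed durable

    def append(self, lsn: int, payload: bytes) -> None:
        """Record generated by the primary; not yet known to be durable."""
        self._records[lsn] = payload

    def mark_persisted(self, lsn: int) -> None:
        """The persistent log confirmed durability up to and including lsn."""
        self._persisted_up_to = max(self._persisted_up_to, lsn)

    def request(self, lsn: int) -> Optional[bytes]:
        """Return the record only if it is confirmed persisted, else None."""
        if lsn <= self._persisted_up_to:
            return self._records.get(lsn)
        return None  # withheld: persistence cannot yet be confirmed


broker = LogBroker()
broker.append(1, b"insert row 7")
broker.append(2, b"update row 7")
broker.mark_persisted(1)
served = broker.request(1)    # persisted, so it is provided
withheld = broker.request(2)  # not yet persisted, so it is withheld
```

Withholding record 2 means a secondary never applies a change that the primary could lose in a crash before the record reaches the persistent log.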
-
Publication No.: US12118014B2
Publication Date: 2024-10-15
Application No.: US18351258
Filing Date: 2023-07-12
Applicant: Microsoft Technology Licensing, LLC
Inventor: Alejandro Hernandez Saenz, Cristian Diaconu, Krystyna Ewa Reisteter, Naveen Prakash, Sheetal Shrotri, Rogério Ramos, Alexander Budovski, Hanumantha Rao Kodavalla
IPC: G06F16/00, G06F16/25, G06F16/27, G06F16/22, G06F16/2455
CPC classification number: G06F16/256, G06F16/278, G06F16/2272, G06F16/24557
Abstract: Distributed database systems including compute nodes and page servers are described herein that enable separating logical and physical storage of database files in a distributed database system. A distributed database system includes a plurality of page servers and a compute node, and is configured to store a logical database file that includes data and is associated with a file identifier. Each page server is configurable to store slices (i.e., subportions) of the logical database file. The compute node is coupled to the plurality of page servers and configured to store the logical database file responsive to a received command. In an aspect, such storage may comprise slicing the data comprising the logical database file into a set of slices with each being associated with a respective page server, maintaining an endpoint mapping for each slice of the set of slices, and transmitting each slice to the associated page server for storage thereby.
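The slicing and endpoint mapping described above can be sketched as follows. The function `slice_file`, the fixed slice size, the round-robin assignment of slices to page servers, and the endpoint names are all assumptions made for illustration; the abstract does not specify a slicing or placement policy.

```python
from typing import Dict, List, Tuple


def slice_file(
    data: bytes, slice_size: int, endpoints: List[str]
) -> Tuple[Dict[int, str], Dict[str, List[Tuple[int, bytes]]]]:
    """Split data into slices and assign each to a page-server endpoint.

    Returns (endpoint_mapping, outbox):
      endpoint_mapping: slice index -> endpoint that stores the slice
      outbox: endpoint -> list of (slice index, slice bytes) to transmit
    """
    if slice_size <= 0 or not endpoints:
        raise ValueError("slice_size must be positive and endpoints non-empty")
    endpoint_mapping: Dict[int, str] = {}
    outbox: Dict[str, List[Tuple[int, bytes]]] = {ep: [] for ep in endpoints}
    for index, start in enumerate(range(0, len(data), slice_size)):
        endpoint = endpoints[index % len(endpoints)]  # assumed round-robin policy
        endpoint_mapping[index] = endpoint
        outbox[endpoint].append((index, data[start:start + slice_size]))
    return endpoint_mapping, outbox


# A 10-byte logical file cut into 4-byte slices across two hypothetical page servers.
mapping, outbox = slice_file(b"abcdefghij", 4, ["ps-0", "ps-1"])
```

Keeping the mapping on the compute node lets it route a later read of any slice to the page server that holds it, independent of how the logical file is laid out physically.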
-
Publication No.: US11249866B1
Publication Date: 2022-02-15
Application No.: US17237707
Filing Date: 2021-04-22
Applicant: Microsoft Technology Licensing, LLC
Inventor: Alexander Budovski, Cristian Diaconu, Sandeep Lingam, Alejandro Hernandez Saenz, Naveen Prakash, Krystyna Ewa Reisteter, Rogerio Ramos, Huanhui Hu, Peter Byrne
Abstract: Embodiments described herein detect data corruption in a distributed data set system. For example, a system comprises node(s) for processing queries with respect to a distributed data set comprising a plurality of storage segments. A write transaction resulting from a query with respect to a particular storage segment is logged in a log record that describes a modification to the storage segment. A log service provides the log record to a data server managing a portion of the distributed data set in which the storage segment is included, which performs the write transaction with respect to the storage segment. For redundancy purposes, the data server has replica(s) that manage respective replicas of the portion of the distributed data set managed thereby. For backup purposes, snapshots of the replica(s) are periodically generated. To determine a data corruption, a snapshot of one replica is cross-validated with a snapshot of another replica.
-
Publication No.: US10802715B2
Publication Date: 2020-10-13
Application No.: US16138139
Filing Date: 2018-09-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Cristian Diaconu, Alejandro Hernandez Saenz, Naveen Prakash, Alexander Budovski
Abstract: The mounting of a drive to two or more computing systems. For instance, the drive may be mounted to a first computing system so as to be writable (and potentially readable) by the first computing system. The drive is also mounted to one or more other computing systems so as to be only readable by those one or more computing systems. This allows for multiple computing systems to have access to the drive without risk that the data thereon will become corrupt. In one embodiment, the only user data stored on that drive is a single file of fixed size. Thus, even when user data is written into the fixed-size file, the management data stored on the drive (which keeps track of the files) does not change.
-
Publication No.: US11797523B2
Publication Date: 2023-10-24
Application No.: US17180519
Filing Date: 2021-02-19
Applicant: Microsoft Technology Licensing, LLC
Inventor: Craig S. Freedman, Adrian-Leonard Radu, Daniel G. Schall, Hanumantha R. Kodavalla, Panagiotis Antonopoulos, Raghavendra Thallam Kodandaramaih, Alejandro Hernandez Saenz, Naveen Prakash
IPC: G06F16/00, G06F16/23, G06F16/21, G06F16/2455, G06F16/27
CPC classification number: G06F16/2379, G06F16/211, G06F16/2455, G06F16/27
Abstract: Distributed database systems including compute nodes and page servers are described herein that enable compute nodes to push down certain query processing compute tasks to the page servers to take advantage of otherwise idle compute resources at the page servers, and to reduce the quantity of data that moves between compute nodes and page servers. A distributed database system includes a page server and a compute node, wherein the page server is configured to maintain multiple versions of stored data objects. The compute node is configured to receive a query, generate a transaction context (TC) and modified table schemas (MTS) scoped to the query, and push down the query, TC, and MTS to the page server, which is configured to determine which data objects at the page server satisfy the query and, for each such object, which version of the object should be returned based on the TC.
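The version selection at the page server can be sketched as follows. This is a loose illustration under strong assumptions: the TC is modeled as a single snapshot timestamp, visibility follows an ordinary snapshot rule (the newest version committed at or before the snapshot), and the MTS is omitted entirely. The abstract does not say what the TC contains or how versions are chosen, and the names `Version`, `TransactionContext`, `visible_version`, and `pushdown_scan` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Version:
    commit_ts: int  # assumed commit timestamp of this version
    row: dict


@dataclass(frozen=True)
class TransactionContext:
    snapshot_ts: int  # assumed: the query sees versions committed at or before this


def visible_version(versions: List[Version], tc: TransactionContext) -> Optional[Version]:
    """Pick the newest version committed at or before the snapshot, if any."""
    candidates = [v for v in versions if v.commit_ts <= tc.snapshot_ts]
    return max(candidates, key=lambda v: v.commit_ts, default=None)


def pushdown_scan(
    store: Dict[str, List[Version]],
    predicate: Callable[[dict], bool],
    tc: TransactionContext,
) -> List[dict]:
    """Evaluate the pushed-down predicate on the visible version of each object."""
    results = []
    for key in sorted(store):
        version = visible_version(store[key], tc)
        if version is not None and predicate(version.row):
            results.append(version.row)
    return results


store = {
    "a": [Version(5, {"id": 1, "qty": 3}), Version(12, {"id": 1, "qty": 9})],
    "b": [Version(20, {"id": 2, "qty": 7})],
}
rows = pushdown_scan(store, lambda r: r["qty"] >= 3, TransactionContext(snapshot_ts=10))
```

Only qualifying rows cross the network in this sketch, which is the data-movement saving the abstract attributes to pushdown.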
-
Publication No.: US11151101B2
Publication Date: 2021-10-19
Application No.: US16138373
Filing Date: 2018-09-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Cristian Diaconu, Alejandro Hernandez Saenz
Abstract: Adaptive adjusting of the growth of a persistent log. The persistent log has a log record generator that adds log records to the persistent log. In addition, there are multiple log consumers that consume records from the persistent log. The log consumers publish log processing parameters with respect to the persistent log. The log processing parameters are then used to determine an appropriate adjustment in the growth of the log, which adjustments may then be executed. As an example, the log processing parameter may be a log consumption progress, in which case the log generator may be caused to slow down the generation of log records, thereby slowing the growth of the log.
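The throttling example at the end of the abstract can be sketched as follows. The function `generation_delay`, the lag threshold, and the linear delay ramp are assumptions made for illustration; the abstract only says that published consumption progress may be used to slow the log generator.

```python
from typing import Dict


def generation_delay(
    head_lsn: int,
    consumer_progress: Dict[str, int],
    max_lag: int,
    max_delay_s: float = 0.1,
) -> float:
    """Return how long the generator should pause before its next record.

    head_lsn: position of the newest record in the log
    consumer_progress: consumer name -> last position it has consumed
    max_lag: lag (in records) the slowest consumer may have before throttling
    """
    if max_lag <= 0:
        raise ValueError("max_lag must be positive")
    if not consumer_progress:
        return 0.0
    lag = head_lsn - min(consumer_progress.values())  # slowest consumer decides
    if lag <= max_lag:
        return 0.0
    overshoot = min(lag - max_lag, max_lag)           # cap the ramp at 2 * max_lag
    return max_delay_s * overshoot / max_lag


# The "replica" consumer is 600 records behind with a 500-record budget.
delay = generation_delay(1000, {"backup": 990, "replica": 400}, max_lag=500)
```

Basing the delay on the slowest consumer keeps the log from growing faster than the furthest-behind reader can drain it.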
-
Publication No.: US10949412B2
Publication Date: 2021-03-16
Application No.: US16138057
Filing Date: 2018-09-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Cristian Diaconu, Naveen Prakash, Alexander Budovski, Alejandro Hernandez Saenz
Abstract: The use of log marking (otherwise known as “coloring”) of sub-portions of a log that records actions (e.g., data operations) performed by a computing system. The log is composed of multiple sub-portions, such as virtual log files, which are successively added to the log as the log grows. For instance, the sub-portions may be virtual log files of the log. The principles described herein change the use of log marking depending on which sub-portion of the log is being marked. If the computing system fails, and recovery is needed, the recovery process can thus deterministically identify where the last written log record is.
-
Publication No.: US11880318B2
Publication Date: 2024-01-23
Application No.: US17705981
Filing Date: 2022-03-28
Applicant: Microsoft Technology Licensing, LLC
Inventor: Rogério Ramos, Kareem Aladdin Golaub, Chaitanya Gottipati, Alejandro Hernandez Saenz, Raj Kripal Danday
IPC: G06F13/16
CPC classification number: G06F13/1673
Abstract: Methods for local page writes via pre-staging buffers for resilient buffer pool extensions are performed by computing systems. Compute nodes in database systems insert, update, and query data pages maintained in storage nodes. Data pages cached locally by compute node buffer pools are provided to buffer pool extensions on local disks as pre-copies via staging buffers that store data pages prior to local disk storage. Encryption of data pages occurs at the staging buffers, which allows a less restrictive update latching during the copy process, with page metadata being updated in buffer pool extension page tables with in-progress states indicating that the page is not yet written to local disk. When staging buffers are filled, data pages are written to buffer pool extensions and metadata is updated in page tables to indicate available/valid states. Data pages in staging buffers can be read and updated prior to writing to the local disk.
-
Publication No.: US11860829B2
Publication Date: 2024-01-02
Application No.: US17180508
Filing Date: 2021-02-19
Applicant: Microsoft Technology Licensing, LLC
Inventor: Craig S. Freedman, Adrian-Leonard Radu, Daniel G. Schall, Hanumantha R. Kodavalla, Panagiotis Antonopoulos, Raghavendra Thallam Kodandaramaih, Alejandro Hernandez Saenz, Naveen Prakash
IPC: G06F16/21, G06F16/24, G06F16/245
CPC classification number: G06F16/21, G06F16/245
Abstract: Methods for page split detection and affinity in query processing pushdowns are performed by systems and devices. Page servers perform pushdown operations based on specific, and specifically formatted or generated, information, instructions, and data provided thereto from a compute node. Page servers also determine that page splits have occurred during reading of data pages maintained by page servers during pushdown operations, and also during fulfillment of compute node data requests. To detect that a data page has split, page servers utilize information from a compute node identifying an expected next data page, which is compared to the next data page in the page server page index. A mismatch in the comparison by page servers indicates that the data page was split. Compute nodes and page servers store and maintain off-row data generated during data operations via page affinity considerations where the off-row data is stored at the same page server as the data.
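The split check described above can be sketched as follows. The page index is modeled here as a plain mapping from page ID to next page ID, and the function name `split_detected` is hypothetical; the abstract does not describe the actual index structure.

```python
from typing import Dict, Optional


def split_detected(
    page_index: Dict[int, Optional[int]],
    current_page: int,
    expected_next: Optional[int],
) -> bool:
    """Compare the compute node's expected next page with the page server's index.

    page_index: page ID -> ID of the next page in scan order (None at the end)
    expected_next: next page ID the compute node supplied with its request
    """
    actual_next = page_index.get(current_page)
    return actual_next != expected_next  # a mismatch indicates a split


# The compute node believes page 10 is followed by page 11, but page 10 has
# since split and a new page 42 now sits between them in the server's index.
server_index = {10: 42, 42: 11, 11: None}
stale_view = split_detected(server_index, current_page=10, expected_next=11)
fresh_view = split_detected(server_index, current_page=42, expected_next=11)
```

On a mismatch, the page server knows rows may have moved to a page the compute node did not ask about, so the request cannot be answered from the originally expected page chain alone.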