-
Publication No.: US20240248905A1
Publication date: 2024-07-25
Application No.: US18159667
Filing date: 2023-01-25
Applicant: VMware, Inc.
Inventor: Dimiter DIMITRIEV, Kostadin GEORGIEV, Abhishek GUPTA, Christos KARAMANOLIS, Richard P. SPILLANE
IPC: G06F16/25
CPC classification number: G06F16/254
Abstract: References to changing data sets in distributed data lakes are optimized. As part of a transaction, a first message is received. The first message identifies a table and first data to be written to the table. Based on at least the table, the first message is routed to a first ingestion node of a plurality of ingestion nodes. The first data is persisted in temporary storage. Location information of the persisted first data is determined. A data available message comprising a self-describing reference to the first data is published, by the first ingestion node, to a first reader node of a plurality of reader nodes. The self-describing reference identifies the first ingestion node, the location information of the first data, and a range of the first data.
-
Publication No.: US20200233801A1
Publication date: 2020-07-23
Application No.: US16252488
Filing date: 2019-01-18
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Robert T. JOHNSON, Richard P. SPILLANE, Sandeep RANGASWAMY, Jorge GUERRA DELGADO, Kapil CHOWKSEY, Srinath PREMACHANDRAN
IPC: G06F12/0804, G06F16/22, G06F16/2455, G06F7/16
Abstract: Certain aspects provide systems and methods for performing an operation on a Bε-tree. A method comprises writing a message associated with the operation to a first slot in a first buffer of a first non-leaf node of the Bε-tree in an append-only manner, wherein a first filter associated with the first slot is used for query operations associated with the first slot. The method further comprises determining that the first buffer is full and, upon determining to flush the message to a non-leaf child node, flushing the message in an append-only manner to a second slot in a second buffer of the non-leaf child node, wherein a second filter associated with the second slot is used for query operations associated with the second slot. The method further comprises, upon determining to flush the message to a leaf node, flushing the message to the leaf node in a sorted manner.
-
Publication No.: US20190294709A1
Publication date: 2019-09-26
Application No.: US15927019
Filing date: 2018-03-20
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Rob T. JOHNSON, Srinath PREMACHANDRAN, Richard P. SPILLANE, Sandeep RANGASWAMY, Jorge GUERRA DELGADO, Kapil CHOWKSEY, Wenguang WANG
IPC: G06F17/30
Abstract: Exemplary methods, apparatuses, and systems include a file system process inserting a first key/value pair and a second key/value pair into a first tree. The second key is a duplicate of the first key and the value of the second key/value pair is an operation changing the value. In response to a request for a range of key/value pairs, the process reads the second key/value pair and inserts it in a second tree. The process reads the first pair and determines, while inserting the first pair in the second tree, that the second key is a duplicate of the first key. The file system process determines an updated value of the first value by applying the operation in the second value to the first value. The file system process updates the second key/value pair in the second tree with the updated value and returns the requested range of key/value pairs.
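The read path described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the "first tree" is modeled as an insertion-ordered list, the "second tree" as a dictionary, and the name `read_range` is hypothetical. It also assumes at most one pending operation per key.

```python
def read_range(first_tree, lo, hi):
    """Build a 'second tree' for keys in [lo, hi], resolving duplicate keys.

    first_tree is a list of (key, value) pairs in insertion order; a value
    may be a plain integer or a callable operation that changes the value.
    """
    second_tree = {}
    # Visit the newest entries first, so an operation is read before the
    # older base value it applies to (as in the abstract).
    for key, value in reversed(first_tree):
        if not (lo <= key <= hi):
            continue
        if key not in second_tree:
            second_tree[key] = value
        elif callable(second_tree[key]):
            # Duplicate key: apply the pending operation to the older value
            # and store the updated value in the second tree.
            second_tree[key] = second_tree[key](value)
        # Otherwise a newer plain value is already present; keep it.
    return dict(sorted(second_tree.items()))


# The second ("a", ...) pair is an operation that increments the value.
log = [("a", 10), ("b", 5), ("a", lambda v: v + 1)]
result = read_range(log, "a", "b")
```

Here the duplicate key "a" is resolved during the range read rather than at insert time, which is the deferral the abstract describes.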
-
Publication No.: US20190294715A1
Publication date: 2019-09-26
Application No.: US15927030
Filing date: 2018-03-20
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Rob T. JOHNSON, Srinath PREMACHANDRAN, Richard P. SPILLANE, Sandeep RANGASWAMY, Jorge GUERRA DELGADO, Kapil CHOWKSEY, Wenguang WANG
Abstract: Exemplary methods, apparatuses, and systems include a file system process obtaining locks on a first node and a second node in a tree structure, with the second node being a child node of the first node. The file system process determines a quantity of child nodes of the second node. While holding the locks on the first and second nodes, the file system process determines whether to proactively split or merge the second node. In response to determining that the quantity of child nodes is within a first range, the file system process splits the second node. If the file system process determines that the quantity of child nodes is within a second range, the file system process merges the second node.
-
Publication No.: US20190286360A1
Publication date: 2019-09-19
Application No.: US16431648
Filing date: 2019-06-04
Applicant: VMware, Inc.
Inventor: Jorge GUERRA DELGADO, Jin ZHANG, Radhika VULLIKANTI, Abhishek GUPTA
Abstract: A logical group of data blocks stored in a first node is migrated to a second node according to a method that includes determining a first metric for each logical group of data blocks stored in the first node, the first metric representing a total size of the data blocks in the logical group, determining a second metric for each logical group of data blocks stored in the first node, the second metric representing a total size of the data blocks in the logical group that are uniquely stored in the first node, and selecting a logical group of data blocks for migration from the first node to the second node based on the first metric and the second metric.
-
Publication No.: US20190080107A1
Publication date: 2019-03-14
Application No.: US15703706
Filing date: 2017-09-13
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Rick SPILLANE, Kapil CHOWKSEY, Rob JOHNSON, Wenguang WANG
Abstract: Embodiments of the present disclosure relate to techniques for performing a merge update for a database. In particular, certain embodiments of a method include generating a message comprising a first key and a first transaction associated with the first key, the first transaction indicating a transaction to perform other than for key-value pairs comprising the first key. The method further includes storing the message in a database. The method further includes merging the message with a first key-value pair stored in the database, the first key-value pair comprising the first key. The method further includes performing the first transaction based on merging the message with the first key-value pair.
-
Publication No.: US20230385265A1
Publication date: 2023-11-30
Application No.: US17827795
Filing date: 2022-05-30
Applicant: VMware, Inc.
Inventor: Christos KARAMANOLIS, Abhishek GUPTA, Richard P. SPILLANE, Marin NOZHCHEV
CPC classification number: G06F16/2365, G06F16/2282, G06F11/1435
Abstract: A version control interface provides for accessing a data lake with transactional semantics. Examples generate a plurality of tables for data objects stored in the data lake. The tables each comprise a set of name fields and map a space of columns or rows to a set of the data objects. Transactions read and write data objects and may span a plurality of tables with properties of atomicity, consistency, isolation, durability (ACID). Performing the transaction comprises: accumulating transaction-incomplete messages, indicating that the transaction is incomplete, until a transaction-complete message is received, indicating that the transaction is complete. Upon this occurring, a master branch is updated to reference the data objects according to the transaction-incomplete messages and the transaction-complete message. Tables may be grouped into data groups that provide atomicity boundaries so that different groups may be served by different master branches, thereby improving the speed of master branch updates.
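The accumulate-then-commit step in this abstract can be sketched as follows. This is an illustrative simplification, not the patented method: the master branch is modeled as a dictionary from table name to a list of data object names, and the message fields (`type`, `txn`, `table`, `object`) and the function name `apply_messages` are assumptions made for the example.

```python
def apply_messages(messages, master):
    """Return the master branch after processing a stream of messages.

    Transaction-incomplete messages are only accumulated; the master branch
    changes when the matching transaction-complete message arrives.
    """
    pending = {}  # transaction ID -> list of (table, data object) writes
    for msg in messages:
        txn = msg["txn"]
        if msg["type"] == "incomplete":
            pending.setdefault(txn, []).append((msg["table"], msg["object"]))
        elif msg["type"] == "complete":
            staged = pending.pop(txn, [])
            # Build the new master as a copy, then swap it in at once, so a
            # multi-table transaction becomes visible atomically.
            new_master = {t: list(objs) for t, objs in master.items()}
            for table, obj in staged:
                new_master.setdefault(table, []).append(obj)
            master = new_master
    return master


messages = [
    {"type": "incomplete", "txn": 1, "table": "t1", "object": "a.parquet"},
    {"type": "incomplete", "txn": 2, "table": "t1", "object": "x.parquet"},
    {"type": "incomplete", "txn": 1, "table": "t2", "object": "b.parquet"},
    {"type": "complete", "txn": 1},
]
master = apply_messages(messages, {"t1": ["old.parquet"]})
```

Transaction 2 never completes in this stream, so its write stays out of the master branch, while transaction 1 updates two tables in a single step.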
-
Publication No.: US20240248879A1
Publication date: 2024-07-25
Application No.: US18159677
Filing date: 2023-01-25
Applicant: VMware, Inc.
Inventor: Dimiter DIMITRIEV, Kostadin GEORGIEV, Abhishek GUPTA, Christos KARAMANOLIS, Richard P. SPILLANE
IPC: G06F16/172, G06F16/11, G06F16/17
CPC classification number: G06F16/172, G06F16/122, G06F16/1724
Abstract: Storage file size in distributed data lakes is optimized. At a first ingestion node of a plurality of ingestion nodes, a merge advisory is received from a coordinator. The merge advisory indicates a transaction identifier (ID). Received data associated with the transaction ID is persisted, which includes: determining whether the received data, persisted together in a single file will exceed a maximum desired file size; based on determining that the maximum desired file size will not be exceeded, persisting the received data in a single file; and based on determining that the maximum desired file size will be exceeded, persisting the received data in a plurality of files that each does not exceed the maximum desired file size. A location of the persisted received data in the permanent storage is identified, by the first ingestion node, to the coordinator.
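The file-size decision in this abstract can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the claimed method: data is represented only by record sizes, records are packed greedily in arrival order, and `plan_files` is a hypothetical name.

```python
def plan_files(record_sizes, max_file_size):
    """Group record sizes into files whose totals stay within max_file_size."""
    if sum(record_sizes) <= max_file_size:
        # The maximum desired file size is not exceeded: use a single file.
        return [list(record_sizes)]
    files, current, current_size = [], [], 0
    for size in record_sizes:
        if size > max_file_size:
            raise ValueError("a single record exceeds the maximum file size")
        if current and current_size + size > max_file_size:
            # Adding this record would overflow the file: start a new one.
            files.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        files.append(current)
    return files


single = plan_files([10, 20, 30], max_file_size=100)
several = plan_files([40, 40, 40, 10], max_file_size=100)
```

The first call fits in one file; the second totals 130 and is split into two files, each within the limit.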
-
Publication No.: US20240126744A1
Publication date: 2024-04-18
Application No.: US17967286
Filing date: 2022-10-17
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Christos KARAMANOLIS, Richard P. SPILLANE, Martin DEKOV, Ivo STRATEV
CPC classification number: G06F16/2379, G06F16/2282
Abstract: Intelligent, transaction-aware table placement minimizes cross-host transactions while supporting full transactional semantics and delivering high throughput at low resource utilization. This placement reduces delays caused by cross-host transaction coordination. Examples determine a count of historical interactions between tables, based on at least a transaction history for a plurality of cross-table transactions. Each table provides an abstraction for data, such as by identifying data objects stored in a data lake. For tables on different hosts having a high count of historical interactions, the potential cost savings achievable by moving operational control of a first table to the same host as the second table is compared with the potential cost savings achievable by moving operational control of the second table to the same host as the first table. Based on comparing the relative cost savings, one of the tables may be selected. Operational control of the selected table is moved without moving any of the data objects.
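The first step of this abstract, counting historical interactions between tables, can be sketched as below. The sketch covers only the counting and the cross-host filter, not the cost comparison; the transaction history format, the table and host names, and the function name `interaction_counts` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def interaction_counts(history):
    """Count how often each pair of tables appears in the same transaction."""
    counts = Counter()
    for tables in history:
        # Sort so that each unordered pair maps to one canonical key.
        for pair in combinations(sorted(tables), 2):
            counts[pair] += 1
    return counts


# Each entry is the set of tables touched by one historical transaction.
history = [
    {"orders", "customers"},
    {"orders", "customers", "items"},
    {"items"},
]
counts = interaction_counts(history)

# Hypothetical current placement of operational control for each table.
hosts = {"orders": "host1", "customers": "host2", "items": "host1"}
cross_host = {p: n for p, n in counts.items() if hosts[p[0]] != hosts[p[1]]}
```

Pairs in `cross_host` with high counts are the candidates for which a placement policy would then compare the savings of moving one table's operational control versus the other's.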
-
Publication No.: US20230205757A1
Publication date: 2023-06-29
Application No.: US17564206
Filing date: 2021-12-28
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Richard P. SPILLANE, Christos KARAMANOLIS, Marin NOZHCHEV
CPC classification number: G06F16/2379, G06F16/2246
Abstract: A version control interface for data provides a layer of abstraction that permits multiple readers and writers to access data lakes concurrently. An overlay file system, based on a data structure such as a tree, is used on top of one or more underlying storage instances to implement the interface. Each tree node is identified and accessed by means of a universally unique identifier. Copy-on-write with the tree data structure implements snapshots of the overlay file system. The snapshots support a long-lived master branch, with point-in-time snapshots of its history, and one or more short-lived private branches. As data objects are written to the data lake, the private branch corresponding to a writer is updated. The private branches are merged back into the master branch using any merging logic, and conflict resolution policies are implemented. Readers read from the updated master branch or from any of the private branches.