-
Publication No.: US20200242034A1
Publication Date: 2020-07-30
Application No.: US16256726
Filing Date: 2019-01-24
Applicant: VMware, Inc.
Inventor: Wenguang WANG, Christoph KLEE, Adrian DRZEWIECKI, Christos KARAMANOLIS, Richard P. SPILLANE, Maxime AUSTRUY
IPC: G06F12/0815, G06F12/1027
Abstract: The present disclosure provides techniques for managing a cache of a computer system using a cache management data structure. The cache management data structure includes a cold queue, a ghost queue, and a hot queue. The techniques herein improve the functioning of the computer for three reasons: management of the cache management data structure can be performed in parallel with multiple cores or multiple processors; a sequential scan pollutes (i.e., adds unimportant memory pages to) only the cold queue and, to an extent, the ghost queue, but not the hot queue; and the cache management data structure has lower memory requirements and lower CPU overhead on cache hit than some prior art algorithms.
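The scan-resistance property described in this abstract can be illustrated with a minimal sketch. The abstract does not state the promotion or eviction rules, so the 2Q-style policy below (new pages enter the cold queue, evicted keys are remembered in the ghost queue, and a page re-inserted while its key is in the ghost queue is promoted to the hot queue) is an assumption, as are the class name `ThreeQueueCache` and its parameters.

```python
from collections import OrderedDict


class ThreeQueueCache:
    """Hypothetical sketch: cold/ghost/hot queues with an assumed 2Q-style policy."""

    def __init__(self, cold_size, ghost_size, hot_size):
        self.cold = OrderedDict()   # key -> value, pages seen for the first time
        self.ghost = OrderedDict()  # key -> None, keys recently evicted from cold
        self.hot = OrderedDict()    # key -> value, pages re-inserted after eviction
        self.cold_size = cold_size
        self.ghost_size = ghost_size
        self.hot_size = hot_size

    def get(self, key):
        # A hit only reads a dictionary; no queue is reordered.
        if key in self.hot:
            return self.hot[key]
        return self.cold.get(key)

    def put(self, key, value):
        if key in self.hot:
            self.hot[key] = value
        elif key in self.cold:
            self.cold[key] = value
        elif key in self.ghost:
            # Seen again after eviction: promote to the hot queue.
            del self.ghost[key]
            self.hot[key] = value
            if len(self.hot) > self.hot_size:
                self.hot.popitem(last=False)
        else:
            # First sighting: goes to the cold queue only.
            self.cold[key] = value
            if len(self.cold) > self.cold_size:
                old_key, _ = self.cold.popitem(last=False)
                self.ghost[old_key] = None
                if len(self.ghost) > self.ghost_size:
                    self.ghost.popitem(last=False)
```

Under these assumptions, a long run of one-time keys cycles through the cold and ghost queues and never displaces an entry in the hot queue.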
-
Publication No.: US20200241939A1
Publication Date: 2020-07-30
Application No.: US16256713
Filing Date: 2019-01-24
Applicant: VMware, Inc.
Inventor: Wenguang WANG, Christoph KLEE, Adrian DRZEWIECKI, Christos KARAMANOLIS, Richard P. SPILLANE, Maxime AUSTRUY
IPC: G06F9/54
Abstract: The disclosure provides an approach for performing an operation by a first process on behalf of a second process, the method comprising: obtaining, by the first process, a memory handle from the second process, wherein the memory handle allows access, by the first process, to at least some of the address space of the second process; dividing the address space of the memory handle into a plurality of sections; receiving, by the first process, a request from the second process to perform an operation; determining, by the first process, a section of the plurality of sections that is to be mapped from the address space of the memory handle to the address space of the first process for the performance of the operation by the first process; mapping the section from the address space of the memory handle to the address space of the first process; and performing the operation by the first process on behalf of the second process.
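The step of determining which section to map can be sketched as simple index arithmetic. This assumes fixed-size sections and a request expressed as a byte offset and length within the handle's address space; neither detail is stated in the abstract, and the function name `sections_for_request` is hypothetical.

```python
def sections_for_request(offset, length, section_size):
    """Return the indices of the fixed-size sections that a request touches.

    Assumption: the handle's address space is divided into equal sections of
    `section_size` bytes, numbered from 0.
    """
    if length <= 0 or section_size <= 0:
        raise ValueError("length and section_size must be positive")
    first = offset // section_size
    last = (offset + length - 1) // section_size
    return list(range(first, last + 1))
```

Only the returned sections would need to be mapped into the first process's address space before it performs the operation.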
-
Publication No.: US20190294715A1
Publication Date: 2019-09-26
Application No.: US15927030
Filing Date: 2018-03-20
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Rob T. JOHNSON, Srinath PREMACHANDRAN, Richard P. SPILLANE, Sandeep RANGASWAMY, Jorge GUERRA DELGADO, Kapil CHOWKSEY, Wenguang WANG
Abstract: Exemplary methods, apparatuses, and systems include a file system process obtaining locks on a first node and a second node in a tree structure, with the second node being a child node of the first node. The file system process determines a quantity of child nodes of the second node. While holding the locks on the first and second nodes, the file system process determines whether to proactively split or merge the second node. In response to determining that the quantity of child nodes is within a first range, the file system process splits the second node. If the file system process determines that the quantity of child nodes is within a second range, the file system process merges the second node.
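The decision step can be sketched as follows. The concrete ranges, the `Node` class, and the function name `plan_rebalance` are hypothetical; the abstract only says that the decision is made while both locks are held and that each action has its own range of child counts.

```python
import threading


class Node:
    """Hypothetical tree node with a per-node lock."""

    def __init__(self, children=None):
        self.children = children or []
        self.lock = threading.Lock()


def plan_rebalance(parent, child, split_range, merge_range):
    """Decide whether to split or merge `child` while holding both locks.

    Locks are taken parent first, then child, so that all top-down
    traversals acquire them in the same order.
    """
    with parent.lock, child.lock:
        count = len(child.children)
        if count in split_range:
            return "split"
        if count in merge_range:
            return "merge"
        return "none"
```

For example, with `split_range=range(7, 9)` and `merge_range=range(0, 3)`, a child holding seven children would be split before it becomes full.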
-
Publication No.: US20240248879A1
Publication Date: 2024-07-25
Application No.: US18159677
Filing Date: 2023-01-25
Applicant: VMware, Inc.
Inventor: Dimiter DIMITRIEV, Kostadin GEORGIEV, Abhishek GUPTA, Christos KARAMANOLIS, Richard P. SPILLANE
IPC: G06F16/172, G06F16/11, G06F16/17
CPC classification number: G06F16/172, G06F16/122, G06F16/1724
Abstract: Storage file size in distributed data lakes is optimized. At a first ingestion node of a plurality of ingestion nodes, a merge advisory is received from a coordinator. The merge advisory indicates a transaction identifier (ID). Received data associated with the transaction ID is persisted, which includes: determining whether the received data, persisted together in a single file, will exceed a maximum desired file size; based on determining that the maximum desired file size will not be exceeded, persisting the received data in a single file; and based on determining that the maximum desired file size will be exceeded, persisting the received data in a plurality of files, none of which exceeds the maximum desired file size. A location of the persisted received data in the permanent storage is identified, by the first ingestion node, to the coordinator.
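The single-file versus multi-file decision can be sketched as greedy packing. The abstract does not say how data is divided when the limit would be exceeded, so the in-order greedy strategy and the function name `plan_files` are assumptions, as is the requirement that no single record exceed the limit.

```python
def plan_files(record_sizes, max_file_size):
    """Group record sizes, in arrival order, into files of at most max_file_size."""
    files, current, current_size = [], [], 0
    for size in record_sizes:
        if size > max_file_size:
            raise ValueError("a single record exceeds the maximum file size")
        if current and current_size + size > max_file_size:
            # Adding this record would overflow the file, so start a new one.
            files.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        files.append(current)
    return files
```

When everything fits, the plan contains exactly one file, which corresponds to the "persist in a single file" branch of the abstract.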
-
Publication No.: US20240126744A1
Publication Date: 2024-04-18
Application No.: US17967286
Filing Date: 2022-10-17
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Christos KARAMANOLIS, Richard P. SPILLANE, Martin DEKOV, Ivo STRATEV
CPC classification number: G06F16/2379, G06F16/2282
Abstract: Intelligent, transaction-aware table placement minimizes cross-host transactions while supporting full transactional semantics and delivering high throughput at low resource utilization. This placement reduces delays caused by cross-host transaction coordination. Examples determine a count of historical interactions between tables, based on at least a transaction history for a plurality of cross-table transactions. Each table provides an abstraction for data, such as by identifying data objects stored in a data lake. For tables on different hosts that have a high count of historical interactions, the potential cost savings achievable by moving operational control of a first table to the same host as the second table are compared with the potential cost savings achievable by moving operational control of the second table to the same host as the first table. Based on comparing the relative cost savings, one of the tables may be selected. Operational control of the selected table is moved without moving any of the data objects.
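The counting and comparison steps can be sketched as below. The cost model is an assumption: a move's saving is taken to be the interactions that become same-host minus the interactions that become cross-host. The function names and the `placement` mapping (table name to host name) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def interaction_counts(transactions):
    """Count how often each pair of tables appears in the same transaction."""
    counts = Counter()
    for tables in transactions:
        for a, b in combinations(sorted(set(tables)), 2):
            counts[(a, b)] += 1
    return counts


def move_saving(table, target_host, placement, counts):
    """Assumed cost model: interactions gained locally minus interactions lost."""
    saving = 0
    for (a, b), n in counts.items():
        if table not in (a, b):
            continue
        other = b if a == table else a
        if placement[other] == target_host:
            saving += n   # becomes a same-host transaction
        elif placement[other] == placement[table]:
            saving -= n   # becomes a cross-host transaction
    return saving


def choose_move(t1, t2, placement, counts):
    """Pick which of two tables on different hosts to move next to the other."""
    s1 = move_saving(t1, placement[t2], placement, counts)
    s2 = move_saving(t2, placement[t1], placement, counts)
    return (t1, placement[t2]) if s1 >= s2 else (t2, placement[t1])
```

Only the `placement` entry for the selected table would change; the data objects the table identifies stay where they are.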
-
Publication No.: US20230205757A1
Publication Date: 2023-06-29
Application No.: US17564206
Filing Date: 2021-12-28
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Richard P. SPILLANE, Christos KARAMANOLIS, Marin NOZHCHEV
CPC classification number: G06F16/2379, G06F16/2246
Abstract: A version control interface for data provides a layer of abstraction that permits multiple readers and writers to access data lakes concurrently. An overlay file system, based on a data structure such as a tree, is used on top of one or more underlying storage instances to implement the interface. Each tree node is identified and accessed by means of any universally unique identifier. Copy-on-write with the tree data structure implements snapshots of the overlay file system. The snapshots support a long-lived master branch, with point-in-time snapshots of its history, and one or more short-lived private branches. As data objects are written to the data lake, the private branch corresponding to a writer is updated. The private branches are merged back into the master branch using any merging logic, and conflict resolution policies are implemented. Readers read from the updated master branch or from any of the private branches.
-
Publication No.: US20210064589A1
Publication Date: 2021-03-04
Application No.: US16552880
Filing Date: 2019-08-27
Applicant: VMware, Inc.
Inventor: Wenguang WANG, Junlong GAO, Marcos K. AGUILERA, Richard P. SPILLANE, Christos KARAMANOLIS, Maxime AUSTRUY
IPC: G06F16/215, G06F16/22
Abstract: The present disclosure provides techniques for scaling out deduplication of files among a plurality of nodes. The techniques include designating a master component for the coordination of deduplication. The master component divides files to be deduplicated among several slave nodes, and provides to each slave node a set of unique identifiers that are to be assigned to chunks during the deduplication process. The techniques herein preserve integrity of the deduplication process that has been scaled out among several nodes. The scaled out deduplication process deduplicates files faster by allowing several deduplication modules to work in parallel to deduplicate files.
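The coordination step can be sketched as a planning function run by the master component. Round-robin file division and contiguous integer ID ranges are assumptions (the abstract only requires that each node receive its own set of unique identifiers), and the name `plan_dedup` and its parameters are hypothetical.

```python
def plan_dedup(files, num_nodes, ids_per_node, first_id=0):
    """Give each node a share of the files and a disjoint range of chunk IDs."""
    plans = []
    for node in range(num_nodes):
        start = first_id + node * ids_per_node
        plans.append({
            "files": files[node::num_nodes],               # round-robin split
            "ids": range(start, start + ids_per_node),     # never overlaps another node
        })
    return plans
```

Because the ID ranges do not overlap, nodes can label new chunks in parallel without consulting one another.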
-
Publication No.: US20210026825A1
Publication Date: 2021-01-28
Application No.: US16931219
Filing Date: 2020-07-16
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Richard P. SPILLANE, Robert T. JOHNSON, Srinath PREMACHANDRAN, Jorge GUERRA DELGADO, Kapil CHOWKSEY, Sandeep RANGASWAMY
Abstract: Embodiments described herein are related to a method of scanning a B-tree. For example, a method comprises receiving a scan request to scan a B-tree having a plurality of levels, each level comprising one or more nodes, wherein for each of one or more levels of the plurality of levels, nodes are grouped into groups, where nodes of any given group are stored across sequential disk blocks. The method further comprises generating a queue for each level of the B-tree. For each queue, the method further comprises loading into memory a next group of nodes based upon determining a storage location of a node of the next group of nodes.
-
Publication No.: US20200183905A1
Publication Date: 2020-06-11
Application No.: US16212550
Filing Date: 2018-12-06
Applicant: VMware, Inc.
Inventor: Wenguang WANG, Richard P. SPILLANE, Junlong GAO, Robert T. JOHNSON, Christos KARAMANOLIS, Maxime AUSTRUY
Abstract: Certain aspects provide systems and methods of compacting data within a log-structured merge tree (LSM tree) using sharding. In certain aspects, a method includes determining a size of the LSM tree, determining a compaction time for a compaction of the LSM tree based on the size, determining a number of compaction entities for performing the compaction in parallel based on the compaction time, determining a number of shards based on the number of compaction entities, and determining a key range associated with the LSM tree. The method further comprises dividing the key range by the number of shards into a number of sub key ranges, wherein each of the number of sub key ranges corresponds to a shard of the number of shards and assigning the number of shards to the number of compaction entities for compaction.
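The chain of determinations in this abstract (size, compaction time, number of entities, number of shards, sub key ranges) can be sketched as follows. The formulas are assumptions: compaction time is estimated as size divided by a throughput figure, the entity count is chosen to meet a target duration, and keys are treated as integers split into equal half-open ranges. All parameter names are hypothetical.

```python
import math


def plan_compaction(tree_size_bytes, bytes_per_second, target_seconds,
                    key_lo, key_hi, shards_per_entity=1):
    """Return {entity index: [(lo, hi), ...]} with half-open sub key ranges."""
    # Estimated time for one entity to compact the whole tree.
    compaction_time = tree_size_bytes / bytes_per_second
    # Enough parallel entities to finish within the target duration.
    entities = max(1, math.ceil(compaction_time / target_seconds))
    shards = entities * shards_per_entity
    # Divide [key_lo, key_hi) into `shards` contiguous sub ranges.
    span = key_hi - key_lo
    bounds = [key_lo + span * i // shards for i in range(shards + 1)]
    sub_ranges = [(bounds[i], bounds[i + 1]) for i in range(shards)]
    # Assign shards to entities round-robin.
    return {e: sub_ranges[e::entities] for e in range(entities)}
```

Because the sub ranges are contiguous and non-overlapping, each compaction entity can work on its shards without coordinating on individual keys.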
-
Publication No.: US20190294716A1
Publication Date: 2019-09-26
Application No.: US15927016
Filing Date: 2018-03-20
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Rob T. JOHNSON, Srinath PREMACHANDRAN, Richard P. SPILLANE, Sandeep RANGASWAMY, Jorge GUERRA DELGADO, Kapil CHOWKSEY, Wenguang WANG
IPC: G06F17/30
Abstract: Exemplary methods, apparatuses, and systems include a file system process reading a first node in a tree data structure from a first memory. The first node includes a first approximate membership query data structure (“AMQ”), a first plurality of child pointers, a first plurality of pivot values, and a first buffer. The file system process determines that the first plurality of child pointers exceeds a maximum size. Using a pivot value in the first plurality of pivot values, the file system process splits the first node into a second node and a third node. The file system process uses the pivot value to split the first buffer into a second buffer and a third buffer. Using the pivot value and the first AMQ, the file system process generates a second AMQ and a third AMQ.
-