CPU-EFFICIENT CACHE REPLACEMENT WITH TWO-PHASE EVICTION

    Publication No.: US20200242034A1

    Publication Date: 2020-07-30

    Application No.: US16256726

    Application Date: 2019-01-24

    Applicant: VMware, Inc.

    Abstract: The present disclosure provides techniques for managing a cache of a computer system using a cache management data structure. The cache management data structure includes a cold queue, a ghost queue, and a hot queue. The techniques herein improve the functioning of the computer because management of the cache management data structure can be performed in parallel with multiple cores or multiple processors, because a sequential scan will only pollute (i.e., add unimportant memory pages to) the cold queue and, to an extent, the ghost queue, but not the hot queue, and also because the cache management data structure has lower memory requirements and lower CPU overhead on cache hits than some prior art algorithms.
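    The cold/ghost/hot structure described in the abstract can be sketched as follows. This is an illustrative approximation only: the queue sizes, the keys-only ghost queue, and the promotion rule (a page evicted from cold and then touched again is promoted to hot) are assumptions, not the patent's exact policy.

```python
from collections import OrderedDict

class TwoPhaseCache:
    """Sketch of a cache with cold, ghost, and hot queues (assumed policy)."""

    def __init__(self, cold_size, ghost_size, hot_size):
        self.cold_size, self.ghost_size, self.hot_size = cold_size, ghost_size, hot_size
        self.cold = OrderedDict()   # first-phase pages, evicted into ghost
        self.ghost = OrderedDict()  # keys only (no data) of pages evicted from cold
        self.hot = OrderedDict()    # pages re-referenced after eviction to ghost

    def get(self, key):
        for q in (self.hot, self.cold):
            if key in q:
                q.move_to_end(key)           # refresh LRU position on hit
                return q[key]
        return None                          # miss (ghost holds no data)

    def put(self, key, value):
        if key in self.hot or key in self.cold:
            (self.hot if key in self.hot else self.cold)[key] = value
            return
        if key in self.ghost:                # second touch: promote to hot
            del self.ghost[key]
            self.hot[key] = value
            if len(self.hot) > self.hot_size:
                self.hot.popitem(last=False)
            return
        self.cold[key] = value               # first touch goes to cold only
        if len(self.cold) > self.cold_size:
            old_key, _ = self.cold.popitem(last=False)
            self.ghost[old_key] = None       # remember the eviction, drop data
            if len(self.ghost) > self.ghost_size:
                self.ghost.popitem(last=False)
```

    Note how a long sequential scan of new keys cycles pages through cold and ghost while the hot queue is left untouched, which is the scan-resistance property the abstract claims.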

    SYSTEM AND METHODS OF ZERO-COPY DATA PATH AMONG USER LEVEL PROCESSES

    Publication No.: US20200241939A1

    Publication Date: 2020-07-30

    Application No.: US16256713

    Application Date: 2019-01-24

    Applicant: VMware, Inc.

    Abstract: The disclosure provides an approach for performing an operation by a first process on behalf of a second process, the approach comprising: obtaining, by the first process, a memory handle from the second process, wherein the memory handle allows access, by the first process, to at least some of the address space of the second process; dividing the address space of the memory handle into a plurality of sections; receiving, by the first process, a request from the second process to perform an operation; determining, by the first process, a section of the plurality of sections that is to be mapped from the address space of the memory handle to the address space of the first process for the performance of the operation by the first process; mapping the section from the address space of the memory handle to the address space of the first process; and performing the operation by the first process on behalf of the second process.
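    The section-mapping step can be illustrated with a small sketch. Here a file descriptor stands in for the memory handle, and the section size is an assumed multiple of the platform's mapping granularity; the point is that only the fixed-size section containing the requested range is mapped into the first process, not the whole address space behind the handle.

```python
import mmap
import tempfile

SECTION_SIZE = mmap.ALLOCATIONGRANULARITY * 4   # assumed section size

def perform_read(fd, offset, length):
    # Map only the section containing [offset, offset + length), instead of
    # mapping the entire space behind the handle (fd is a stand-in for the
    # memory handle obtained from the second process).
    section_start = (offset // SECTION_SIZE) * SECTION_SIZE
    span = (offset + length) - section_start
    with mmap.mmap(fd, span, offset=section_start, access=mmap.ACCESS_READ) as m:
        local = offset - section_start
        return m[local:local + length]
```

    Because `section_start` is a multiple of `SECTION_SIZE` (itself a multiple of `mmap.ALLOCATIONGRANULARITY`), the mapping offset always satisfies the alignment requirement of `mmap`.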

    OPTIMIZING STORAGE FILE SIZE IN DISTRIBUTED DATA LAKES

    Publication No.: US20240248879A1

    Publication Date: 2024-07-25

    Application No.: US18159677

    Application Date: 2023-01-25

    Applicant: VMware, Inc.

    CPC classification number: G06F16/172 G06F16/122 G06F16/1724

    Abstract: Storage file size in distributed data lakes is optimized. At a first ingestion node of a plurality of ingestion nodes, a merge advisory is received from a coordinator. The merge advisory indicates a transaction identifier (ID). Received data associated with the transaction ID is persisted, which includes: determining whether the received data, persisted together in a single file, will exceed a maximum desired file size; based on determining that the maximum desired file size will not be exceeded, persisting the received data in a single file; and based on determining that the maximum desired file size will be exceeded, persisting the received data in a plurality of files, each of which does not exceed the maximum desired file size. A location of the persisted received data in permanent storage is identified, by the first ingestion node, to the coordinator.
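    The single-file-versus-split decision can be sketched as a simple planning function. The inputs (a list of per-record sizes and a maximum desired file size) and the greedy packing are assumptions for illustration; the function returns, per output file, the indices of the records it would hold.

```python
def plan_files(record_sizes, max_file_bytes):
    # Decide how to persist received data: one file if everything fits,
    # otherwise several files, each kept under the maximum desired size.
    total = sum(record_sizes)
    if total <= max_file_bytes:
        return [list(range(len(record_sizes)))]   # single file
    files, current, used = [], [], 0
    for i, size in enumerate(record_sizes):
        if current and used + size > max_file_bytes:
            files.append(current)                 # close the current file
            current, used = [], 0
        current.append(i)
        used += size
    if current:
        files.append(current)
    return files
```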

    TRANSACTION-AWARE TABLE PLACEMENT

    Publication No.: US20240126744A1

    Publication Date: 2024-04-18

    Application No.: US17967286

    Application Date: 2022-10-17

    Applicant: VMware, Inc.

    CPC classification number: G06F16/2379 G06F16/2282

    Abstract: Intelligent, transaction-aware table placement minimizes cross-host transactions while supporting full transactional semantics and delivering high throughput at low resource utilization. This placement reduces delays caused by cross-host transaction coordination. Examples determine a count of historical interactions between tables, based on at least a transaction history for a plurality of cross-table transactions. Each table provides an abstraction for data, such as by identifying data objects stored in a data lake. For tables on different hosts having a high count of historical interactions, the potential cost savings achievable by moving operational control of a first table to the same host as the second table are compared with the potential cost savings achievable by moving operational control of the second table to the same host as the first table. Based on comparing the relative cost savings, one of the tables may be selected. Operational control of the selected table is moved without moving any of the data objects.
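    The count-then-compare step can be sketched as follows. The cost model here (savings equal the cross-host interaction count minus an assumed per-table move cost) and the input shapes are illustrative assumptions; only the overall pattern of counting pairwise interactions and comparing the two move directions follows the abstract.

```python
from collections import Counter

def interaction_counts(transaction_history):
    # transaction_history: iterable of sets of table names, one per transaction
    counts = Counter()
    for tables in transaction_history:
        ts = sorted(tables)
        for i in range(len(ts)):
            for j in range(i + 1, len(ts)):
                counts[(ts[i], ts[j])] += 1
    return counts

def plan_moves(transaction_history, host_of, move_cost):
    # For each cross-host pair, compare the net savings of moving operational
    # control of either table to the other's host, keeping the better option
    # when it is positive. Only control moves; data objects stay in place.
    moves = {}
    for (a, b), n in interaction_counts(transaction_history).items():
        if host_of[a] == host_of[b]:
            continue                          # already co-located
        save_a = n - move_cost[a]             # move a's control to b's host
        save_b = n - move_cost[b]             # move b's control to a's host
        if max(save_a, save_b) > 0:
            moves[(a, b)] = a if save_a >= save_b else b
    return moves
```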

    VERSION CONTROL INTERFACE FOR ACCESSING DATA LAKES

    Publication No.: US20230205757A1

    Publication Date: 2023-06-29

    Application No.: US17564206

    Application Date: 2021-12-28

    Applicant: VMware, Inc.

    CPC classification number: G06F16/2379 G06F16/2246

    Abstract: A version control interface for data provides a layer of abstraction that permits multiple readers and writers to access data lakes concurrently. An overlay file system, based on a data structure such as a tree, is used on top of one or more underlying storage instances to implement the interface. Each tree node is identified and accessed by means of a universally unique identifier. Copy-on-write with the tree data structure implements snapshots of the overlay file system. The snapshots support a long-lived master branch, with point-in-time snapshots of its history, and one or more short-lived private branches. As data objects are written to the data lake, the private branch corresponding to a writer is updated. The private branches are merged back into the master branch using any merging logic, and conflict resolution policies are implemented. Readers read from the updated master branch or from any of the private branches.
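    A minimal copy-on-write overlay tree can be sketched like this. Nodes are immutable mappings keyed by UUID, a branch is just a pointer to a root node id, and forking a branch copies nothing. The "theirs wins" merge and the assumption that intermediate entries are always node ids are simplifications; real merging logic and conflict policies would go where noted.

```python
import uuid

class Overlay:
    """Copy-on-write tree sketch: branches are pointers to immutable roots."""

    def __init__(self):
        self.nodes = {}                        # uuid -> {name: child uuid or value}
        self.branches = {"master": self._store({})}

    def _store(self, mapping):
        nid = str(uuid.uuid4())
        self.nodes[nid] = mapping
        return nid

    def fork(self, src, dst):
        self.branches[dst] = self.branches[src]   # private branch, no copying

    def write(self, branch, path, value):
        # Copy-on-write along the path: untouched subtrees stay shared.
        self.branches[branch] = self._write(self.branches[branch], path, value)

    def _write(self, nid, path, value):
        node = dict(self.nodes[nid])           # copy only this node
        if len(path) == 1:
            node[path[0]] = value
        else:
            child = node.get(path[0])          # assumed to be a node id
            node[path[0]] = self._write(child if child else self._store({}),
                                        path[1:], value)
        return self._store(node)

    def read(self, branch, path):
        nid = self.branches[branch]
        for p in path[:-1]:
            nid = self.nodes[nid][p]
        return self.nodes[nid].get(path[-1])

    def merge(self, private, into="master"):
        # Trivial "theirs wins" merge; real conflict policies belong here.
        self.branches[into] = self.branches[private]
```

    Because every write produces a fresh root id, any previously recorded root is a point-in-time snapshot, and readers on the master branch never observe a private branch's writes until merge.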

    SYSTEMS AND METHODS FOR PERFORMING SCALABLE LOG-STRUCTURED MERGE (LSM) TREE COMPACTION USING SHARDING

    Publication No.: US20200183905A1

    Publication Date: 2020-06-11

    Application No.: US16212550

    Application Date: 2018-12-06

    Applicant: VMware, Inc.

    Abstract: Certain aspects provide systems and methods of compacting data within a log-structured merge tree (LSM tree) using sharding. In certain aspects, a method includes determining a size of the LSM tree, determining a compaction time for a compaction of the LSM tree based on the size, determining a number of compaction entities for performing the compaction in parallel based on the compaction time, determining a number of shards based on the number of compaction entities, and determining a key range associated with the LSM tree. The method further comprises dividing the key range by the number of shards into a number of sub key ranges, wherein each of the number of sub key ranges corresponds to a shard of the number of shards and assigning the number of shards to the number of compaction entities for compaction.
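    The planning chain the abstract describes (size, to estimated time, to worker count, to shard count, to sub key ranges) can be sketched as below. The throughput model, the target time, and the over-partitioning factor of 2 are assumed parameters for illustration.

```python
import math

def plan_shards(tree_size_bytes, key_range, target_secs, bytes_per_sec):
    # Estimate compaction time from the LSM tree size, derive how many
    # compaction entities are needed to finish within the target, and split
    # the key range into that many shards (times an assumed factor).
    est_time = tree_size_bytes / bytes_per_sec
    n_workers = max(1, math.ceil(est_time / target_secs))
    n_shards = n_workers * 2                  # assumed over-partitioning factor
    lo, hi = key_range
    step = (hi - lo) / n_shards
    sub_ranges = [(lo + i * step, lo + (i + 1) * step) for i in range(n_shards)]
    # Assign shards (sub key ranges) round-robin to the compaction entities.
    assignments = {w: sub_ranges[w::n_workers] for w in range(n_workers)}
    return sub_ranges, assignments
```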
