-
Publication number: US20230385265A1
Publication date: 2023-11-30
Application number: US17827795
Filing date: 2022-05-30
Applicant: VMware, Inc.
Inventor: Christos KARAMANOLIS, Abhishek GUPTA, Richard P. SPILLANE, Marin NOZHCHEV
CPC classification number: G06F16/2365, G06F16/2282, G06F11/1435
Abstract: A version control interface provides for accessing a data lake with transactional semantics. Examples generate a plurality of tables for data objects stored in the data lake. The tables each comprise a set of name fields and map a space of columns or rows to a set of the data objects. Transactions read and write data objects and may span a plurality of tables with properties of atomicity, consistency, isolation, durability (ACID). Performing the transaction comprises: accumulating transaction-incomplete messages, indicating that the transaction is incomplete, until a transaction-complete message is received, indicating that the transaction is complete. Upon this occurring, a master branch is updated to reference the data objects according to the transaction-incomplete messages and the transaction-complete message. Tables may be grouped into data groups that provide atomicity boundaries so that different groups may be served by different master branches, thereby improving the speed of master branch updates.
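The message-accumulation step described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Message`, `MasterBranch`, and `TransactionManager` are hypothetical, and the master branch is modeled as a plain mapping from table name to a set of data object IDs.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    txn_id: str
    table: str = ""
    object_id: str = ""
    complete: bool = False  # True marks the transaction-complete message


@dataclass
class MasterBranch:
    # table name -> set of data object ids currently referenced (assumed model)
    tables: dict = field(default_factory=dict)


class TransactionManager:
    def __init__(self, master):
        self.master = master
        self.pending = {}  # txn_id -> accumulated transaction-incomplete messages

    def receive(self, msg):
        """Return True once the master branch has been updated."""
        if not msg.complete:
            # Accumulate only; the master branch stays untouched.
            self.pending.setdefault(msg.txn_id, []).append(msg)
            return False
        # Stage a new mapping, then swap it in with a single assignment so
        # readers see either none or all of the transaction's writes.
        staged = {t: set(objs) for t, objs in self.master.tables.items()}
        for m in self.pending.pop(msg.txn_id, []):
            staged.setdefault(m.table, set()).add(m.object_id)
        self.master.tables = staged
        return True
```

Because the staged mapping may touch several tables before the swap, one transaction can span multiple tables, which is the property the abstract attributes to the master branch update.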
-
Publication number: US20210326049A1
Publication date: 2021-10-21
Application number: US16853623
Filing date: 2020-04-20
Applicant: VMware, Inc.
Inventor: Ye ZHANG, Wenguang WANG, Sriram PATIL, Richard P. SPILLANE, Junlong GAO, Wangping HE, Zhaohui GUO, Yang YANG
Abstract: System and method for writing updated versions of a configuration data file for a distributed file system in a storage system uses a directory renaming operation to write a new updated version of the configuration data file using the latest version of the configuration data file and a target directory. After the latest version of the configuration data file is modified by a particular host computer in the storage system, the modified configuration data file is written to a temporary file. The directory renaming operation is then initiated on the temporary file to change the directory for the temporary file to the target directory. If the directory renaming operation fails, the particular host computer retries, writing the new updated version of the configuration data file using a new latest version and a new target directory.
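One way to picture the rename-and-retry loop is the sketch below. It rests on assumptions that are not in the abstract: each version lives in its own directory named `v<N>` on a local POSIX-style file system, the file is called `config.json`, and the temporary file is staged in a temporary directory that is renamed as a whole. The function names are hypothetical.

```python
import json
import os
import shutil
import tempfile


def latest_version(root):
    """Highest N among directories named v<N> under root (0 if none)."""
    versions = [int(d[1:]) for d in os.listdir(root)
                if d.startswith("v") and d[1:].isdigit()]
    return max(versions, default=0)


def update_config(root, modify, max_retries=5):
    """Apply modify(config) and publish the result as the next version."""
    for _ in range(max_retries):
        current = latest_version(root)
        config = {}
        if current:
            with open(os.path.join(root, f"v{current}", "config.json")) as f:
                config = json.load(f)
        modify(config)
        # Write the modified file to a temporary location first.
        tmp_dir = tempfile.mkdtemp(dir=root, prefix="tmp-")
        with open(os.path.join(tmp_dir, "config.json"), "w") as f:
            json.dump(config, f)
        target = os.path.join(root, f"v{current + 1}")
        try:
            # Renaming onto an existing non-empty directory raises OSError,
            # so a host that lost the race lands in the except branch.
            os.rename(tmp_dir, target)
            return current + 1
        except OSError:
            shutil.rmtree(tmp_dir)  # retry with a new latest version and target
    raise RuntimeError("could not publish a new configuration version")
```

The rename acts as the commit point: a concurrent writer that already claimed `v<N+1>` causes the rename to fail, and the loser re-reads the new latest version before trying again.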
-
Publication number: US20210064580A1
Publication date: 2021-03-04
Application number: US16552965
Filing date: 2019-08-27
Applicant: VMware, Inc.
Inventor: Junlong GAO, Wenguang WANG, Marcos K. AGUILERA, Richard P. SPILLANE, Christos KARAMANOLIS, Maxime AUSTRUY
IPC: G06F16/174, G06F16/14, G06F16/901
Abstract: The disclosure provides techniques for deduplicating files. The techniques include, upon creating or modifying a file, placing a logical timestamp of the current logical time within a queue associated with the directory of the file. The techniques further include placing the logical timestamp within a queue of each parent directory of the directory of the file. To determine a set of files for deduplication, the techniques disclosed herein identify files that have been modified within a logical time range. The set of files modified within the logical time range is identified by traversing directories of a storage system, the directories being organized within a tree structure. If a directory's queue does not contain a timestamp that is within the logical time range, then all child directories can be skipped over for further processing, such that no files within the child directories end up being within the set of files for deduplication.
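The pruning traversal can be illustrated with a small in-memory model. This is a sketch under assumptions: the `Directory` and `FileSystem` classes are hypothetical, queues are unbounded, and each file records only its most recent logical modification time.

```python
from collections import deque


class Directory:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        self.files = {}       # file name -> logical time of last modification
        self.queue = deque()  # timestamps of changes in this directory or below
        if parent is not None:
            parent.children[name] = self


class FileSystem:
    def __init__(self):
        self.root = Directory("/")
        self.clock = 0  # logical clock

    def touch(self, directory, filename):
        """Create or modify a file and stamp the directory and its ancestors."""
        self.clock += 1
        directory.files[filename] = self.clock
        node = directory
        while node is not None:
            node.queue.append(self.clock)
            node = node.parent
        return self.clock


def modified_in_range(directory, lo, hi, visited=None):
    """Collect files modified in [lo, hi], skipping subtrees with no match."""
    if visited is not None:
        visited.append(directory.name)
    if not any(lo <= t <= hi for t in directory.queue):
        return []  # nothing below this directory changed in the range
    found = [f for f, t in directory.files.items() if lo <= t <= hi]
    for child in directory.children.values():
        found += modified_in_range(child, lo, hi, visited)
    return found
```

Passing a list as `visited` makes the pruning observable: a directory whose queue has no timestamp in the range is inspected once, and none of its children are ever reached.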
-
Publication number: US20240248905A1
Publication date: 2024-07-25
Application number: US18159667
Filing date: 2023-01-25
Applicant: VMware, Inc.
Inventor: Dimiter DIMITRIEV, Kostadin GEORGIEV, Abhishek GUPTA, Christos KARAMANOLIS, Richard P. SPILLANE
IPC: G06F16/25
CPC classification number: G06F16/254
Abstract: References to changing data sets in distributed data lakes are optimized. As part of a transaction, a first message is received. The first message identifies a table and first data to be written to the table. Based on at least the table, the first message is routed to a first ingestion node of a plurality of ingestion nodes. The first data is persisted in temporary storage. Location information of the persisted first data is determined. A data available message comprising a self-describing reference to the first data is published, by the first ingestion node, to a first reader node of a plurality of reader nodes. The self-describing reference identifies the first ingestion node, the location information of the first data, and a range of the first data.
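A rough sketch of the routing and the self-describing reference follows. The abstract does not say how routing is computed or what the range denotes; here routing is assumed to be a stable hash of the table name, the range is assumed to be a row-offset interval, and `DataRef`, `IngestionNode`, and `route` are hypothetical names.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DataRef:
    """Self-describing reference carried by a data-available message."""
    node_id: int    # which ingestion node holds the data
    location: str   # where in that node's temporary storage
    start: int      # first row offset covered (inclusive)
    end: int        # last row offset covered (exclusive)


class IngestionNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.temp_storage = {}  # location -> persisted rows
        self.next_offset = {}   # table -> next free row offset

    def ingest(self, table, rows):
        """Persist rows in temporary storage and return a reference to them."""
        start = self.next_offset.get(table, 0)
        end = start + len(rows)
        location = f"tmp/{table}/{start}-{end}"
        self.temp_storage[location] = list(rows)
        self.next_offset[table] = end
        return DataRef(self.node_id, location, start, end)


def route(table, nodes):
    """Pick an ingestion node from the table name using a stable hash."""
    return nodes[zlib.crc32(table.encode()) % len(nodes)]
```

A reader node that receives a `DataRef` has everything it needs to fetch the rows, for example `nodes[ref.node_id].temp_storage[ref.location]`, without consulting a central catalog.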
-
Publication number: US20220292061A1
Publication date: 2022-09-15
Application number: US17202342
Filing date: 2021-03-15
Applicant: VMware, Inc.
Inventor: Abhay Kumar JAIN, Wenguang WANG, Richard P. SPILLANE
Abstract: Optimizing file access includes a process for identifying a file access event for a first accessed file, and incrementing a first access counter in an access list in a memory, which also includes access counters for other accessed files. The process further includes exporting the first access counter to a performance monitoring dashboard, or exporting to a storage allocator and, based on the value, moving the first accessed file between a first storage and a second storage. The process also includes determining whether the value of the first access counter meets a first threshold, or a sum of values of the access counters for the other accessed files meets a second threshold. Based on meeting the first threshold or meeting the second threshold, the process includes persisting the access counters on a storage media. The access counters also provide security monitoring (e.g., identifying excessive file access).
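The two-threshold persistence rule can be sketched as below. Several details are assumptions rather than statements from the abstract: the in-memory counters are treated as counts since the last persist and are cleared after being written out, a dictionary stands in for the storage media, and the class name `AccessTracker` is hypothetical.

```python
class AccessTracker:
    def __init__(self, file_threshold, others_threshold):
        self.file_threshold = file_threshold      # first threshold
        self.others_threshold = others_threshold  # second threshold
        self.in_memory = {}  # access list: path -> count since last persist
        self.persisted = {}  # stand-in for counters saved on storage media

    def record_access(self, path):
        """Count one access; return True if the counters were persisted."""
        self.in_memory[path] = self.in_memory.get(path, 0) + 1
        own = self.in_memory[path]
        others = sum(n for p, n in self.in_memory.items() if p != path)
        if own >= self.file_threshold or others >= self.others_threshold:
            self._persist()
            return True
        return False

    def _persist(self):
        # Fold the in-memory counts into the persisted totals, then reset
        # (the reset is an assumption made to avoid persisting on every access).
        for p, n in self.in_memory.items():
            self.persisted[p] = self.persisted.get(p, 0) + n
        self.in_memory.clear()
```

The second threshold bounds how many accesses to other files can be lost in a crash even when no single file is hot enough to trigger the first threshold.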
-
Publication number: US20210064582A1
Publication date: 2021-03-04
Application number: US16552998
Filing date: 2019-08-27
Applicant: VMware, Inc.
Inventor: Wenguang WANG, Junlong GAO, Marcos K. AGUILERA, Richard P. SPILLANE, Christos KARAMANOLIS, Maxime AUSTRUY
IPC: G06F16/174, G06F16/13, G06F16/172
Abstract: The present disclosure provides techniques for deduplicating files. The techniques include creating a data structure that organizes metadata about chunks of files, the organization of the metadata preserving order and locality of the chunks within files. The organization of the metadata within storage blocks of storage devices matches the order of chunks within files. Upon a read or write operation on metadata, the preserved locality makes it likely that metadata of subsequent and contiguous chunks is fetched from storage into a memory cache. The preserved locality results in faster subsequent read and write operations on metadata, because the read and write operations are likely to be executed from memory rather than from storage.
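The locality effect can be demonstrated with a toy model. It assumes that chunk metadata entries are laid out in file order, that storage is read one fixed-size block at a time, and that a fetched block stays in a memory cache; `ChunkMetadataStore` is a hypothetical name.

```python
class ChunkMetadataStore:
    def __init__(self, entries, block_size):
        # Entries are stored in the same order as the chunks appear in the
        # file, so neighbouring chunks share a storage block.
        self.block_size = block_size
        self.blocks = [entries[i:i + block_size]
                       for i in range(0, len(entries), block_size)]
        self.cache = {}         # block number -> block contents held in memory
        self.storage_reads = 0  # how many times storage had to be touched

    def read(self, chunk_index):
        """Return metadata for one chunk, fetching its whole block on a miss."""
        block_no, offset = divmod(chunk_index, self.block_size)
        if block_no not in self.cache:
            self.storage_reads += 1
            self.cache[block_no] = self.blocks[block_no]
        return self.cache[block_no][offset]
```

Reading the metadata of chunks 0 through 7 in order with a block size of 4 touches storage only twice in this model; the other six reads are served from the cache because earlier reads already pulled in their blocks.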
-
Publication number: US20200233801A1
Publication date: 2020-07-23
Application number: US16252488
Filing date: 2019-01-18
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Robert T. JOHNSON, Richard P. SPILLANE, Sandeep RANGASWAMY, Jorge GUERRA DELGADO, Kapil CHOWKSEY, Srinath PREMACHANDRAN
IPC: G06F12/0804, G06F16/22, G06F16/2455, G06F7/16
Abstract: Certain aspects provide systems and methods for performing an operation on a Bε-tree. A method comprises writing a message associated with the operation to a first slot in a first buffer of a first non-leaf node of the Bε-tree in an append-only manner, wherein a first filter associated with the first slot is used for query operations associated with the first slot. The method further comprises determining that the first buffer is full and, upon determining to flush the message to a non-leaf child node, flushing the message in an append-only manner to a second slot in a second buffer of the non-leaf child node, wherein a second filter associated with the second slot is used for query operations associated with the second slot. The method further comprises, upon determining to flush the message to a leaf node, flushing the message to the leaf node in a sorted manner.
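The append-only buffering and sorted leaf flush can be sketched in a heavily simplified form. The assumptions are significant: each non-leaf node has a single child (no pivot keys), a Python set stands in for the per-slot filter (a real design would likely use something like a Bloom filter), and a node flushes its entire buffer when full. `Slot`, `Leaf`, and `NonLeaf` are hypothetical names.

```python
import bisect


class Slot:
    def __init__(self):
        self.messages = []   # (key, value) pairs in arrival order
        self.filter = set()  # stand-in for the slot's query filter

    def append(self, key, value):
        self.messages.append((key, value))
        self.filter.add(key)


class Leaf:
    def __init__(self):
        self.keys, self.values = [], []  # kept sorted by key

    def apply(self, key, value):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = value
        else:
            self.keys.insert(i, key)
            self.values.insert(i, value)

    def get(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None


class NonLeaf:
    def __init__(self, child, capacity):
        self.child, self.capacity = child, capacity
        self.slots = [Slot()]

    def _size(self):
        return sum(len(s.messages) for s in self.slots)

    def write(self, key, value):
        self.slots[-1].append(key, value)  # append-only write
        if self._size() >= self.capacity:
            self.flush()

    def receive_slot(self, slot):
        self.slots.append(slot)  # a flushed batch arrives as a new slot
        if self._size() >= self.capacity:
            self.flush()

    def flush(self):
        for slot in self.slots:  # oldest first, so newer values win below
            if isinstance(self.child, Leaf):
                for key, value in slot.messages:
                    self.child.apply(key, value)  # sorted insert at the leaf
            else:
                self.child.receive_slot(slot)
        self.slots = [Slot()]

    def get(self, key):
        for slot in reversed(self.slots):  # newest slot first
            if key in slot.filter:         # filter avoids scanning most slots
                for k, v in reversed(slot.messages):
                    if k == key:
                        return v
        return self.child.get(key)
```

Keeping buffers append-only avoids re-sorting on every write; only the leaf pays for ordered insertion, and the per-slot filter keeps point queries from scanning every slot on the way down.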
-
Publication number: US20190294709A1
Publication date: 2019-09-26
Application number: US15927019
Filing date: 2018-03-20
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA, Rob T. JOHNSON, Srinath PREMACHANDRAN, Richard P. SPILLANE, Sandeep RANGASWAMY, Jorge GUERRA DELGADO, Kapil CHOWKSEY, Wenguang WANG
IPC: G06F17/30
Abstract: Exemplary methods, apparatuses, and systems include a file system process inserting a first key/value pair and a second key/value pair into a first tree. The second key is a duplicate of the first key, and the value of the second key/value pair is an operation that changes the value. In response to a request for a range of key/value pairs, the process reads the second key/value pair and inserts it into a second tree. The process then reads the first pair and determines, while inserting the first pair into the second tree, that the second key is a duplicate of the first key. The file system process determines an updated value by applying the operation in the second value to the first value. The file system process updates the second key/value pair in the second tree with the updated value and returns the requested range of key/value pairs.
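The read-time merge of duplicate keys can be sketched as follows. The model is an assumption: the first tree is represented as a list of pairs in insertion order, the second tree as a dictionary, and the only operation is a hypothetical `Add` that increments an integer value.

```python
from dataclasses import dataclass


@dataclass
class Add:
    amount: int  # hypothetical operation: add this amount to the stored value


def range_query(first_tree, lo, hi):
    """first_tree: list of (key, value) in insertion order, duplicates allowed."""
    second_tree = {}
    for key, value in reversed(first_tree):  # newest entries are read first
        if not (lo <= key <= hi):
            continue
        if key not in second_tree:
            second_tree[key] = value  # may be a pending operation
        elif isinstance(second_tree[key], Add):
            pending = second_tree[key]
            # Duplicate key found while inserting the older pair:
            # apply the pending operation to the older value.
            if isinstance(value, Add):
                second_tree[key] = Add(value.amount + pending.amount)
            else:
                second_tree[key] = value + pending.amount
        # Otherwise the newer entry is a plain value and shadows the older one.
    return sorted(second_tree.items())
```

With `first_tree = [("a", 10), ("b", 5), ("a", Add(3)), ("c", 7)]`, the call `range_query(first_tree, "a", "b")` resolves the duplicate `"a"` at read time, so the write path never has to look up the old value before recording an update.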
-
Publication number: US20210334178A1
Publication date: 2021-10-28
Application number: US16859944
Filing date: 2020-04-27
Applicant: VMware, Inc.
Inventor: Yang YANG, Ye ZHANG, Xiang YU, Wenguang WANG, Richard P. SPILLANE, Sriram PATIL
Abstract: System and method for automatic remediation for a distributed file system uses a file system (FS) remediation module running in a cluster management server and FS remediation agents running in a cluster of host computers. The FS remediation module monitors the cluster of host computers for related events. When a first file system service (FSS)-impacting event is detected, a cluster-level remediation action is executed at the cluster management server by the FS remediation module in response to the detected first FSS-impacting event. When a second FSS-impacting event is detected, a host-level remediation action is executed at one or more of the host computers in the cluster by the FS remediation agents in response to the detected second FSS-impacting event.
-
Publication number: US20210141728A1
Publication date: 2021-05-13
Application number: US16679570
Filing date: 2019-11-11
Applicant: VMware, Inc.
Inventor: Wenguang WANG, Mounesh BADIGER, Abhay Kumar JAIN, Junlong GAO, Zhaohui GUO, Richard P. SPILLANE
IPC: G06F12/0842, G06F12/0844, G06F12/0871, G06F12/1018, G06F12/14
Abstract: Disclosed are a method and system for managing multi-threaded concurrent access to a cache data structure. The cache data structure includes a hash table and three queues. The hash table includes a list of elements for each hash bucket with each hash bucket containing a mutex object and elements in each of the queues containing lock objects. Multiple threads can each lock a different hash bucket to have access to the list, and multiple threads can each lock a different element in the queues. The locks permit highly concurrent access to the cache data structure without conflict. Also, atomic operations are used to obtain pointers to elements in the queues so that a thread can safely advance each pointer. Race conditions that are encountered with locking an element in the queues or entering an element into the hash table are detected, and the operation encountering the race condition is retried.
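The per-bucket locking idea can be sketched as below. This covers only the hash-table part; the three queues, per-element locks, and atomic pointer updates from the abstract are not modeled. `Bucket` and `ConcurrentCache` are hypothetical names.

```python
import threading


class Bucket:
    def __init__(self):
        self.lock = threading.Lock()  # one mutex per hash bucket
        self.items = {}               # elements that hash to this bucket


class ConcurrentCache:
    def __init__(self, n_buckets=16):
        self.buckets = [Bucket() for _ in range(n_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def get_or_insert(self, key, make_value):
        """Return the cached value, creating it once if it is missing."""
        bucket = self._bucket(key)
        # Only this bucket is locked, so threads working on keys in
        # other buckets proceed without waiting.
        with bucket.lock:
            if key not in bucket.items:
                bucket.items[key] = make_value()
            return bucket.items[key]
```

Splitting one global lock into many bucket locks means contention arises only when two threads hit the same bucket at the same time, which becomes less likely as the number of buckets grows.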