-
Publication No.: US20240248905A1
Publication Date: 2024-07-25
Application No.: US18159667
Filing Date: 2023-01-25
Applicant: VMware, Inc.
Inventor: Dimiter DIMITRIEV, Kostadin GEORGIEV, Abhishek GUPTA, Christos KARAMANOLIS, Richard P. SPILLANE
IPC: G06F16/25
CPC classification number: G06F16/254
Abstract: References to changing data sets in distributed data lakes are optimized. As part of a transaction, a first message is received. The first message identifies a table and first data to be written to the table. Based on at least the table, the first message is routed to a first ingestion node of a plurality of ingestion nodes. The first data is persisted in temporary storage. Location information of the persisted first data is determined. A data available message comprising a self-describing reference to the first data is published, by the first ingestion node, to a first reader node of a plurality of reader nodes. The self-describing reference identifies the first ingestion node, the location information of the first data, and a range of the first data.
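The flow in this abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the hash-based `route` function, the `DataReference` fields, the in-memory buffer standing in for temporary storage, and the `temp://` location string are all assumptions chosen for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DataReference:
    """Hypothetical self-describing reference carried by a data available message."""
    ingestion_node: str  # which ingestion node holds the data
    location: str        # where the data was persisted in temporary storage
    start: int           # first byte of the range
    length: int          # number of bytes in the range


def route(table: str, nodes: list[str]) -> str:
    """Pick an ingestion node from the table name (assumed: stable hash modulo node count)."""
    digest = hashlib.sha256(table.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class IngestionNode:
    def __init__(self, name: str) -> None:
        self.name = name
        self.buffer = bytearray()  # stands in for temporary storage

    def ingest(self, data: bytes) -> DataReference:
        """Persist the data and return a reference that a reader could resolve later."""
        start = len(self.buffer)
        self.buffer.extend(data)
        return DataReference(self.name, f"temp://{self.name}/buffer", start, len(data))

    def read(self, ref: DataReference) -> bytes:
        """Resolve a reference back to the bytes it describes."""
        return bytes(self.buffer[ref.start:ref.start + ref.length])
```

In this sketch, a reader that receives a `DataReference` needs no further lookup to find the data: the node, location, and range are all carried in the reference itself.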
-
Publication No.: US20240248879A1
Publication Date: 2024-07-25
Application No.: US18159677
Filing Date: 2023-01-25
Applicant: VMware, Inc.
Inventor: Dimiter DIMITRIEV, Kostadin GEORGIEV, Abhishek GUPTA, Christos KARAMANOLIS, Richard P. SPILLANE
IPC: G06F16/172, G06F16/11, G06F16/17
CPC classification number: G06F16/172, G06F16/122, G06F16/1724
Abstract: Storage file size in distributed data lakes is optimized. At a first ingestion node of a plurality of ingestion nodes, a merge advisory is received from a coordinator. The merge advisory indicates a transaction identifier (ID). Received data associated with the transaction ID is persisted, which includes: determining whether the received data, persisted together in a single file, will exceed a maximum desired file size; based on determining that the maximum desired file size will not be exceeded, persisting the received data in a single file; and based on determining that the maximum desired file size will be exceeded, persisting the received data in a plurality of files, none of which exceeds the maximum desired file size. A location of the persisted received data in the permanent storage is identified, by the first ingestion node, to the coordinator.
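The file-size decision in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: the function name `plan_files`, the greedy packing order, and the choice to reject a single record larger than the limit are all hypothetical.

```python
def plan_files(records: list[bytes], max_file_size: int) -> list[list[bytes]]:
    """Group records into files so that no file exceeds max_file_size bytes."""
    if not records:
        return []

    # Case 1: everything fits in a single file.
    if sum(len(r) for r in records) <= max_file_size:
        return [list(records)]

    # Case 2: split across several files (assumed: greedy, in arrival order).
    files: list[list[bytes]] = []
    current: list[bytes] = []
    current_size = 0
    for record in records:
        if len(record) > max_file_size:
            # Assumption: an oversized record is an error rather than being split.
            raise ValueError("record larger than the maximum desired file size")
        if current and current_size + len(record) > max_file_size:
            files.append(current)
            current, current_size = [], 0
        current.append(record)
        current_size += len(record)
    files.append(current)
    return files
```

For example, three records of 4, 4, and 2 bytes with a 6-byte limit would be planned as two files, the first holding one record and the second holding the remaining two.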
-
Publication No.: US20190236154A1
Publication Date: 2019-08-01
Application No.: US15880774
Filing Date: 2018-01-26
Applicant: VMware, Inc.
Inventor: Kostadin GEORGIEV, Murad MURAD, Deyan POPOV, Grigor HARBALIEV, Lyubomir TZVETKOV
CPC classification number: G06F16/24522, G06F9/45558, G06F16/951, G06F17/2705, G06F2009/45595
Abstract: One or more embodiments provide techniques for querying a database having information contained therein that is associated with one or more virtual machines executing on a host. A management agent receives a query in a first format from a user device, wherein the first format is not executable in the database. The management agent parses the query to identify one or more commands in the query. The management agent translates the query into a modified query that is executable in the database. The management agent references one or more pre-loaded properties associated with the database. The management agent executes the translated query against the database. The management agent returns results of the execution.
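The parse, translate, and execute steps in this abstract can be sketched in Python using the standard `sqlite3` module. The query format (`list <fields> where <field> <op> <value>`), the `PROPERTIES` mapping, the `vms` table, and its column names are all hypothetical; the abstract does not specify any of them.

```python
import sqlite3

# Hypothetical pre-loaded properties: user-facing field name -> database column.
PROPERTIES = {"name": "vm_name", "cpus": "num_cpus", "memory": "memory_mb"}
OPERATORS = {"=", "<", ">", "<=", ">="}


def translate(query: str) -> tuple[str, tuple]:
    """Translate 'list f1,f2 where f op value' into parameterized SQL."""
    tokens = query.split()
    if len(tokens) != 6 or tokens[0] != "list" or tokens[2] != "where":
        raise ValueError("unsupported query format")
    fields, field, op, value = tokens[1].split(","), tokens[3], tokens[4], tokens[5]
    if op not in OPERATORS or field not in PROPERTIES:
        raise ValueError("unknown operator or field")
    if any(f not in PROPERTIES for f in fields):
        raise ValueError("unknown field in selection")
    columns = ", ".join(PROPERTIES[f] for f in fields)
    # Only whitelisted column names are interpolated; the value is bound as a parameter.
    sql = f"SELECT {columns} FROM vms WHERE {PROPERTIES[field]} {op} ?"
    return sql, (int(value),)


def run(conn: sqlite3.Connection, query: str) -> list[tuple]:
    """Translate the query, execute it against the database, and return the rows."""
    sql, params = translate(query)
    return conn.execute(sql, params).fetchall()
```

The first-format query is never passed to the database directly; only the translated, parameterized SQL is executed, which mirrors the abstract's point that the original format is not executable in the database.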
-