Invention Grant
- Patent Title: Efficient caching and data access to a remote data lake in a large scale data processing environment
- Application No.: US17204899
- Application Date: 2021-03-17
- Publication No.: US11797447B2
- Publication Date: 2023-10-24
- Inventor: Xiongbing Ou, Thomas Anthony Phelan, David Lee
- Applicant: Hewlett Packard Enterprise Development LP
- Applicant Address: US TX Houston
- Assignee: Hewlett Packard Enterprise Development LP
- Current Assignee: Hewlett Packard Enterprise Development LP
- Current Assignee Address: US TX Spring
- Agency: Mauriel Kapouytian Woods LLP
- Main IPC: G06F12/0811
- IPC: G06F12/0811; G06F16/182; G06F12/121; G06F9/54

Abstract:
Embodiments described herein are generally directed to caching and data access improvements in a large scale data processing environment. According to an example, an agent running on a first worker node of a cluster receives a read request from a task. The worker node of the cluster to which the data at issue is mapped is identified. When the first worker node is the identified worker node, it is determined whether its cache contains the data; if not, the data is fetched from a remote data lake and the agent locally caches the data; otherwise, when the identified worker node is another worker node of the compute cluster, the data is fetched from a remote agent of that worker node. The agent responds to the read request with cached data, data returned by the remote data lake, or data returned by the remote data agent as the case may be.
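
The read path described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Agent` class, the `owner_of` and `build_cluster` helpers, and the hash-modulo mapping of data to worker nodes are all hypothetical names and assumptions chosen for illustration, since the abstract does not say how data is mapped to a node or how agents communicate.

```python
import hashlib
from typing import Callable, Dict, List


class Agent:
    """Hypothetical per-worker-node agent (illustrative names only)."""

    def __init__(self, node_id: int, fetch_from_lake: Callable[[str], bytes]):
        self.node_id = node_id
        self.fetch_from_lake = fetch_from_lake  # stand-in for the remote data lake
        self.cache: Dict[str, bytes] = {}       # this node's local cache
        self.agents: List["Agent"] = []         # every agent in the cluster, by node_id

    def owner_of(self, key: str) -> int:
        # Assumption: a stable hash modulo the node count maps data to one node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.agents)

    def read(self, key: str) -> bytes:
        owner = self.owner_of(key)
        if owner != self.node_id:
            # Data is mapped to another worker node: fetch it from that node's agent.
            return self.agents[owner].read(key)
        if key not in self.cache:
            # This node owns the data but has not cached it yet:
            # fetch from the remote data lake and cache locally.
            self.cache[key] = self.fetch_from_lake(key)
        return self.cache[key]


def build_cluster(n: int, fetch_from_lake: Callable[[str], bytes]) -> List[Agent]:
    """Create n agents that all know about each other (in-process stand-in for a cluster)."""
    agents = [Agent(i, fetch_from_lake) for i in range(n)]
    for agent in agents:
        agent.agents = agents
    return agents
```

In this sketch each piece of data is cached on exactly one node, so repeated reads from any node in the cluster trigger at most one fetch from the data lake; a direct method call stands in for whatever inter-agent transport a real deployment would use.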
Public/Granted literature
- US20220300422A1 EFFICIENT CACHING AND DATA ACCESS TO A REMOTE DATA LAKE IN A LARGE SCALE DATA PROCESSING ENVIRONMENT Public/Granted day: 2022-09-22