-
Publication No.: US11914741B2
Publication Date: 2024-02-27
Application No.: US17444245
Filing Date: 2021-08-02
Inventors: Samuel Szuflita, Alice Yu, Emily Wang, Hao Dang, Megha Arora, Nicholas Gates, Samuel Rogerson
IPC Classification: G06F21/62, G06F21/60, G06F16/901, G06F16/36, G06F16/903
CPC Classification: G06F21/6227, G06F16/367, G06F16/9024, G06F16/90344, G06F21/604, G06F21/6245, G06F2221/2141
Abstract: A computer system is configured to receive a data set from a data provider and automatically save the data set in a quarantine database where copying, moving, and sharing of the data set are restricted until the data set is released by a data provider. The data set is parsed to find and mark portions with potentially sensitive information. At least those parts are reviewed by a data governor, who can confirm, add, edit, or remove markers. Those parts can be visually indicated to the data governor, along with a preview of, metadata about, and analysis of the data set. After reviewing at least the automatically marked portions, the data governor can release the data set to a non-quarantine database where another user can use the data set. The user is restricted from accessing the quarantine database.
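The quarantine-then-release flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and method names (`GovernedStore`, `ingest`, `review`, `release`, `read`) and the SSN-like regular expression used to detect sensitive rows are assumptions made for illustration, since the abstract does not say how sensitive portions are identified.

```python
import re
from dataclasses import dataclass, field

# Hypothetical detector: the abstract does not say how sensitive portions are found.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class DataSet:
    rows: list
    auto_marked: set = field(default_factory=set)  # rows flagged by the parser
    markers: set = field(default_factory=set)      # current markers after review
    reviewed: set = field(default_factory=set)     # rows the governor has looked at


class GovernedStore:
    def __init__(self):
        self.quarantine = {}  # data sets awaiting review
        self.released = {}    # data sets visible to ordinary users

    def ingest(self, name, rows):
        """Save a new data set in quarantine and mark suspicious rows."""
        marked = {i for i, row in enumerate(rows) if SSN_LIKE.search(row)}
        self.quarantine[name] = DataSet(list(rows), set(marked), set(marked))
        return marked

    def review(self, name, index, sensitive):
        """Governor confirms, adds, or removes the marker on one row."""
        ds = self.quarantine[name]
        ds.reviewed.add(index)
        if sensitive:
            ds.markers.add(index)
        else:
            ds.markers.discard(index)

    def release(self, name):
        """Release the data set once every auto-marked row has been reviewed."""
        ds = self.quarantine[name]
        if not ds.auto_marked <= ds.reviewed:
            raise PermissionError("automatically marked rows still need review")
        self.released[name] = self.quarantine.pop(name)

    def read(self, name):
        """Ordinary users can only read released data sets."""
        return self.released[name].rows
```

In this sketch an ordinary user's `read` call fails with `KeyError` while the data set is still quarantined, which stands in for the access restriction described in the abstract.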
-
Publication No.: US20210365581A1
Publication Date: 2021-11-25
Application No.: US17444245
Filing Date: 2021-08-02
Inventors: Samuel Szuflita, Alice Yu, Emily Wang, Hao Dang, Megha Arora, Nicholas Gates, Samuel Rogerson
IPC Classification: G06F21/62, G06F16/36, G06F16/903, G06F21/60, G06F16/901
Abstract: A computer system is configured to receive a data set from a data provider and automatically save the data set in a quarantine database where copying, moving, and sharing of the data set are restricted until the data set is released by a data provider. The data set is parsed to find and mark portions with potentially sensitive information. At least those parts are reviewed by a data governor, who can confirm, add, edit, or remove markers. Those parts can be visually indicated to the data governor, along with a preview of, metadata about, and analysis of the data set. After reviewing at least the automatically marked portions, the data governor can release the data set to a non-quarantine database where another user can use the data set. The user is restricted from accessing the quarantine database.
-
Publication No.: US11521100B1
Publication Date: 2022-12-06
Application No.: US16667820
Filing Date: 2019-10-29
Inventors: Megha Arora, Samuel Szuflita, Hao Dang, Mihir Patil, Yeong Wei Wee, Alice Yu
Abstract: Systems and methods are provided for processing an input dataset or running an inference. The systems and methods may be configured to accept an input dataset, access one or more predefined logic plugins for processing the input dataset, process the input dataset based at least in part on a first predefined logic plugin, and generate one or more outputs based at least in part on the processing of the input dataset. The one or more outputs may have a different format than a format of the input dataset.
-
Publication No.: US11093634B1
Publication Date: 2021-08-17
Application No.: US16219504
Filing Date: 2018-12-13
Inventors: Samuel Szuflita, Alice Yu, Emily Wang, Hao Dang, Megha Arora, Nicholas Gates, Samuel Rogerson
IPC Classification: G06F16/901, G06F21/62, G06F21/60, G06F16/36, G06F16/903
Abstract: A computer system is configured to receive a data set from a data provider and automatically save the data set in a quarantine database where copying, moving, and sharing of the data set are restricted until the data set is released by a data provider. The data set is parsed to find and mark portions with potentially sensitive information. At least those parts are reviewed by a data governor, who can confirm, add, edit, or remove markers. Those parts can be visually indicated to the data governor, along with a preview of, metadata about, and analysis of the data set. After reviewing at least the automatically marked portions, the data governor can release the data set to a non-quarantine database where another user can use the data set. The user is restricted from accessing the quarantine database.
-
Publication No.: US10176217B1
Publication Date: 2019-01-08
Application No.: US15698574
Filing Date: 2017-09-07
Inventors: Hao Dang, Gustav Brodman, Yi Xue, Stacey Milspaw, Yifei Huang, Yanran Lu
Abstract: Techniques for automatically scheduling builds of derived datasets in a distributed database system that supports pipelined data transformations are described herein. In an embodiment, a data processing method comprises, in association with a distributed database system that implements one or more data transformation pipelines, each of the data transformation pipelines comprising at least a first dataset, a first transformation, a second derived dataset and dataset dependency and timing metadata, detecting an arrival of a new raw dataset or new derived dataset; in response to the detecting, obtaining from the dataset dependency and timing metadata a dataset subset comprising those datasets that depend on at least the new raw dataset or new derived dataset; for each member dataset in the dataset subset, determining if the member dataset has a dependency on any other dataset that is not yet arrived, and in response to determining that the member dataset does not have a dependency on any other dataset that is not yet arrived: initiating a build of a portion of the data transformation pipeline comprising the member dataset and all other datasets on which the member dataset is dependent, without waiting for arrival of other datasets.
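The dependency check described in this abstract can be sketched in a few lines of Python. This is an illustrative simplification, not the patented scheduler: the function name `ready_builds`, the dictionary layout of the dependency metadata, and the dataset names are assumptions, and the timing metadata mentioned in the abstract is left out.

```python
def ready_builds(dependencies, arrived, new_dataset):
    """Return the derived datasets that can be built once new_dataset arrives.

    dependencies maps each derived dataset to the datasets it is built from
    (a hypothetical stand-in for the dependency metadata in the abstract).
    arrived is the set of datasets that were already available.
    """
    available = set(arrived) | {new_dataset}
    # Subset of datasets that depend on the newly arrived dataset.
    candidates = [d for d, parents in dependencies.items() if new_dataset in parents]
    # Keep only those with no dependency that has not yet arrived.
    return sorted(d for d in candidates if set(dependencies[d]) <= available)


# Hypothetical pipeline: "joined" needs both raw inputs, "report" needs "joined".
deps = {
    "orders_clean": ["orders"],
    "joined": ["orders", "customers"],
    "report": ["joined"],
}
print(ready_builds(deps, set(), "orders"))          # ['orders_clean']
print(ready_builds(deps, {"orders"}, "customers"))  # ['joined']
```

When "orders" arrives alone, only "orders_clean" is buildable; "joined" has to wait until "customers" has also arrived, which matches the rule of building a member dataset only when none of its dependencies is still missing.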
-
Publication No.: US20190114289A1
Publication Date: 2019-04-18
Application No.: US16208435
Filing Date: 2018-12-03
Inventors: Hao Dang, Gustav Brodman, Yi Xue, Stacey Milspaw, Yifei Huang, Yanran Lu
IPC Classification: G06F16/182, G06F9/455
Abstract: Techniques for automatically scheduling builds of derived datasets in a distributed database system that supports pipelined data transformations are described herein. In an embodiment, a data processing method comprises, in association with a distributed database system that implements one or more data transformation pipelines, each of the data transformation pipelines comprising at least a first dataset, a first transformation, a second derived dataset and dataset dependency and timing metadata, detecting an arrival of a new raw dataset or new derived dataset; in response to the detecting, obtaining from the dataset dependency and timing metadata a dataset subset comprising those datasets that depend on at least the new raw dataset or new derived dataset; for each member dataset in the dataset subset, determining if the member dataset has a dependency on any other dataset that is not yet arrived, and in response to determining that the member dataset does not have a dependency on any other dataset that is not yet arrived: initiating a build of a portion of the data transformation pipeline comprising the member dataset and all other datasets on which the member dataset is dependent, without waiting for arrival of other datasets.
-
Publication No.: US20240220505A1
Publication Date: 2024-07-04
Application No.: US18396944
Filing Date: 2023-12-27
Inventors: Aditya Shashi, Gregory DeArment, Hao Dang, Philip Bale, Richard Niemi, Shardool Patel, Takashi Okamoto, Wenshuai Hou
IPC Classification: G06F16/2457, G06F16/22, G06F40/40
CPC Classification: G06F16/24573, G06F16/2246, G06F40/40
Abstract: Methods and systems for context-aware change management include performing the operations of: receiving a change request for a software service, the change request comprising change metadata; automatically obtaining context information for the change request; making a decision on the change request based at least in part on the change metadata and the context information; and setting a change status for the change request.
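The four operations listed in this abstract (receive a request, obtain context, decide, set a status) can be sketched as follows. The decision rule, the metadata key `risk`, the context keys `open_incident` and `change_freeze`, and all names here are hypothetical; the abstract does not specify what context is gathered or how the decision is made.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    service: str
    metadata: dict           # change metadata supplied with the request
    status: str = "pending"  # change status, set by process()


def process(request, get_context):
    """Decide on a change request using its metadata plus fetched context.

    get_context is a caller-supplied function (an assumption of this sketch)
    that returns a dict of context information for a service.
    """
    context = get_context(request.service)
    risky = request.metadata.get("risk") == "high"
    blocked = context.get("open_incident", False) or (
        risky and context.get("change_freeze", False)
    )
    request.status = "rejected" if blocked else "approved"
    return request.status


# Hypothetical context source: a fixed lookup table keyed by service name.
CONTEXT = {"billing": {"change_freeze": True}, "search": {}}
request = ChangeRequest("billing", {"risk": "high"})
print(process(request, lambda service: CONTEXT.get(service, {})))  # rejected
```

Passing the context lookup in as a function keeps the decision logic independent of where the context actually comes from.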
-
Publication No.: US11314698B2
Publication Date: 2022-04-26
Application No.: US16208435
Filing Date: 2018-12-03
Inventors: Hao Dang, Gustav Brodman, Yi Xue, Stacey Milspaw, Yifei Huang, Yanran Lu
IPC Classification: G06F16/182, G06F16/2455, G06F16/25, G06F16/23, G06F9/455
Abstract: Techniques for automatically scheduling builds of derived datasets in a distributed database system that supports pipelined data transformations are described herein. In an embodiment, a data processing method comprises, in association with a distributed database system that implements one or more data transformation pipelines, each of the data transformation pipelines comprising at least a first dataset, a first transformation, a second derived dataset and dataset dependency and timing metadata, detecting an arrival of a new raw dataset or new derived dataset; in response to the detecting, obtaining from the dataset dependency and timing metadata a dataset subset comprising those datasets that depend on at least the new raw dataset or new derived dataset; for each member dataset in the dataset subset, determining if the member dataset has a dependency on any other dataset that is not yet arrived, and in response to determining that the member dataset does not have a dependency on any other dataset that is not yet arrived: initiating a build of a portion of the data transformation pipeline comprising the member dataset and all other datasets on which the member dataset is dependent, without waiting for arrival of other datasets.