Invention Grant
- Patent Title: Post-hoc management of datasets
- Application No.: US15480971
- Application Date: 2017-04-06
- Publication No.: US10417439B2
- Publication Date: 2019-09-17
- Inventor: Philip Korn, Steven Euijong Whang, Natalya Fridman Noy, Sudip Roy, Neoklis Polyzotis, Alon Yitzchak Halevy, Christopher Olston
- Applicant: Google LLC
- Applicant Address: US CA Mountain View
- Assignee: Google LLC
- Current Assignee: Google LLC
- Current Assignee Address: US CA Mountain View
- Agency: Fish & Richardson P.C.
- Main IPC: G06F17/30
- IPC: G06F17/30; G06F21/62; G06F16/21; G06F16/215

Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a catalog for multiple datasets, the method comprising accessing multiple extant data sets, the extant data sets including data sets that are independently generated and structurally dissimilar; organizing the data sets into collections, each data set in each collection belonging to the collection based on collection data associated with the data set; for each collection of data sets: determining, from a subset of the data sets that belong to the collection, metadata that describe the data sets that belong to the collection, wherein the metadata does not include the collection data, and attributing, to other data sets in the collection, the metadata determined from the subset of data sets; and generating, from the collections of data sets and the determined metadata, a catalog for the multiple datasets.
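The steps named in the abstract (group data sets into collections, determine metadata from a subset of each collection, attribute it to the other members, then build a catalog) can be illustrated with a minimal sketch. This is not the patented implementation: the `Dataset` class, its fields, the `build_catalog` function, and the first-value-wins merge rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Dataset:
    # All field names are hypothetical; the abstract does not define a schema.
    path: str                 # identifier of an existing data set
    collection_key: str       # the "collection data" used only for grouping
    metadata: dict = field(default_factory=dict)  # descriptive metadata, may be empty


def build_catalog(datasets):
    """Group data sets, derive per-collection metadata, and propagate it."""
    # Step 1: organize data sets into collections by their collection data.
    collections = defaultdict(list)
    for ds in datasets:
        collections[ds.collection_key].append(ds)

    catalog = {}
    for key, members in collections.items():
        # Step 2: determine metadata from the subset of members that carry any.
        # Assumption: the first value seen for a metadata name wins.
        shared = {}
        for ds in members:
            for name, value in ds.metadata.items():
                shared.setdefault(name, value)

        # Step 3: attribute that metadata to members that lack it.
        for ds in members:
            for name, value in shared.items():
                ds.metadata.setdefault(name, value)

        # Step 4: record the collection and its metadata in the catalog.
        # The collection key is kept separate from the metadata dictionary.
        catalog[key] = {
            "datasets": [ds.path for ds in members],
            "metadata": shared,
        }
    return catalog
```

In this sketch, a data set with no metadata of its own inherits whatever its collection peers declare, while data sets in other collections are unaffected.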
Public/Granted literature
- US20170293671A1 POST-HOC MANAGEMENT OF DATASETS, Public/Granted day: 2017-10-12