Invention Grant
- Patent Title: Business data lake search engine
- Application No.: US15794387
- Application Date: 2017-10-26
- Publication No.: US10795895B1
- Publication Date: 2020-10-06
- Inventor: Ran Taig, Avitan Gefen, Omer Sagi
- Applicant: EMC IP Holding Company LLC
- Applicant Address: US MA Hopkinton
- Assignee: EMC IP Holding Company LLC
- Current Assignee: EMC IP Holding Company LLC
- Current Assignee Address: US MA Hopkinton
- Agency: Ryan, Mason & Lewis, LLP
- Main IPC: G06F16/00
- IPC: G06F16/00; G06F16/2457; G06F16/14; G06F16/335; G06F16/901

Abstract:
Business Data Lake searching techniques are provided. A method comprises obtaining a graph representing tables of the Business Data Lake, where each node represents one table and edges between nodes represent foreign key connections; applying a node rank algorithm to determine a relevancy score of the tables based on a number of links to/from other tables; and, in response to a query: ranking a relevancy of query items based on a term frequency-based score to generate candidate results; extracting a candidate sub-graph based on the following: a top-L tables based on the term frequency-based score, and/or a top-M tables based on a topic model distance score for the given query and candidate items; enriching the extracted candidate sub-graph by adding new tables using an item-to-item collaborative filter where a similarity between two tables is measured based on a number of interactions; and ordering the tables in the enriched sub-graph based on the relevancy score and/or a user-to-item collaborative filter that evaluates past user interactions with prior results.
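
The pipeline in the abstract can be illustrated with a small sketch. This is not the patented implementation: the node rank is approximated here with a PageRank-style power iteration over an undirected foreign-key graph, the term frequency-based score is a plain normalized term count, and the item-to-item similarity is a count of sessions in which two tables were used together. The topic model distance and the user-to-item collaborative filter are omitted. All function names, parameters (`damping`, `top_l`, `min_shared`), table names, and data below are hypothetical.

```python
from collections import Counter


def node_rank(nodes, edges, damping=0.85, iterations=100):
    """PageRank-style score per table; edges are foreign-key links (assumed undirected)."""
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        rank = {
            n: (1 - damping) / len(nodes)
            + damping * sum(rank[m] / len(neighbors[m]) for m in neighbors[n])
            for n in nodes
        }
    return rank


def tf_score(query, text):
    """Share of words in a table description that match a query term."""
    words = Counter(text.lower().split())
    total = sum(words.values())
    if total == 0:
        return 0.0
    return sum(words[t] for t in query.lower().split()) / total


def enrich(candidates, sessions, min_shared=2):
    """Item-to-item step: add tables that co-occur with a candidate in enough sessions."""
    enriched = set(candidates)
    for c in candidates:
        co_used = Counter()
        for session in sessions:
            if c in session:
                co_used.update(t for t in session if t != c)
        enriched.update(t for t, count in co_used.items() if count >= min_shared)
    return enriched


def search(query, descriptions, edges, sessions, top_l=2):
    """Top-L tables by term frequency, enriched, then ordered by node rank."""
    rank = node_rank(list(descriptions), edges)
    scores = {t: tf_score(query, d) for t, d in descriptions.items()}
    top = sorted(scores, key=scores.get, reverse=True)[:top_l]
    candidates = [t for t in top if scores[t] > 0]
    return sorted(enrich(candidates, sessions), key=rank.get, reverse=True)


# Hypothetical data lake: five tables joined in a chain by foreign keys.
descriptions = {
    "customers": "customer name address customer segment",
    "orders": "order date customer id order total",
    "order_items": "order id product id quantity",
    "products": "product name price category",
    "suppliers": "supplier name country",
}
edges = [
    ("customers", "orders"),
    ("orders", "order_items"),
    ("order_items", "products"),
    ("products", "suppliers"),
]
# Each session is the set of tables one user touched together (assumed log format).
sessions = [
    {"customers", "orders"},
    {"orders", "order_items"},
    {"orders", "order_items", "products"},
    {"customers", "orders", "order_items"},
]

results = search("customer order", descriptions, edges, sessions)
print(results)
```

In this sketch the query matches `orders` and `customers` directly, while `order_items` enters the result only through the co-usage step, which is the role the abstract assigns to the item-to-item collaborative filter.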