Invention Grant
- Patent Title: Self-learning based crawling and rule-based data mining for automatic information extraction
- Application No.: US15077563
- Application Date: 2016-03-22
- Publication No.: US10762437B2
- Publication Date: 2020-09-01
- Inventor: Arun Kumar A V, Hemant Kumar Rath, Shameemraj M Nadaf, Anantha Simha
- Applicant: Tata Consultancy Services Limited
- Applicant Address: IN Mumbai
- Assignee: TATA CONSULTANCY SERVICES LIMITED
- Current Assignee: TATA CONSULTANCY SERVICES LIMITED
- Current Assignee Address: IN Mumbai
- Agency: Finnegan, Henderson, Farabow, Garrett & Dunner LLP
- Main IPC: G06N20/00
- IPC: G06N20/00; G06F16/951; G06F16/95; G06N5/04

Abstract:
Methods and systems for automatic information extraction by performing self-learning crawling and rule-based data mining are provided. The method determines whether a crawl policy exists within the input information and performs at least one of front-end crawling, assisted crawling, and recursive crawling. The downloaded data set is pre-processed to remove noisy data and then subjected to classification rules and decision-tree-based data mining to extract meaningful information. These crawling techniques yield smaller, relevant datasets pertaining to a specific domain from the multi-dimensional datasets available in online and offline sources.
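
The flow described in the abstract (check for a crawl policy, clean the downloaded data, then apply classification rules) can be illustrated with a minimal Python sketch. This is not the patented implementation: every name here (CrawlInput, has_crawl_policy, preprocess, RULES, classify, extract) is hypothetical, "noise" is assumed to mean blank or duplicate records, and an ordered rule list stands in for the decision-tree step. The crawling itself is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CrawlInput:
    """Hypothetical container for the input information."""
    seed: str
    crawl_policy: Optional[dict] = None  # assumed shape, e.g. {"max_depth": 2}


def has_crawl_policy(info: CrawlInput) -> bool:
    """Report whether the input carries a non-empty crawl policy."""
    return bool(info.crawl_policy)


def preprocess(records: list[str]) -> list[str]:
    """Drop noisy records: blanks and case-insensitive duplicates (assumed notion of noise)."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for record in records:
        text = " ".join(record.split())  # collapse runs of whitespace
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned


# Illustrative classification rules, checked in order; the first match wins.
RULES: list[tuple[str, Callable[[str], bool]]] = [
    ("pricing", lambda t: "price" in t.lower() or "$" in t),
    ("contact", lambda t: "@" in t),
]


def classify(text: str) -> str:
    """Return the label of the first matching rule, or "other"."""
    for label, predicate in RULES:
        if predicate(text):
            return label
    return "other"


def extract(records: list[str]) -> dict[str, str]:
    """Clean the downloaded records, then label each surviving record."""
    return {text: classify(text) for text in preprocess(records)}
```

In this sketch the rule order matters: a record that satisfies several rules receives the label of the first one, which is a simplification of decision-tree-based mining rather than an equivalent of it.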
Public/Granted literature
- US20160371603A1 SELF-LEARNING BASED CRAWLING AND RULE-BASED DATA MINING FOR AUTOMATIC INFORMATION EXTRACTION, Public/Granted day: 2016-12-22