Invention Grant
- Patent Title: Flexible and scalable structured web data extraction
- Patent Title (中): 灵活和可扩展的结构化网络数据提取
- Application No.: US13237142
- Application Date: 2011-09-20
- Publication No.: US08856129B2
- Publication Date: 2014-10-07
- Inventor: Rui Cai, Lei Zhang, Qiang Hao
- Applicant: Rui Cai, Lei Zhang, Qiang Hao
- Applicant Address: US WA Redmond
- Assignee: Microsoft Corporation
- Current Assignee: Microsoft Corporation
- Current Assignee Address: US WA Redmond
- Agency: Lee & Hayes PLLC
- Agent: Carole Boelitz; Micky Minhas
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
This document describes techniques that label text nodes of a seed site for each of a plurality of verticals. Once a seed site is labeled for a given vertical, the techniques extract features from the labeled text nodes of the seed site. The techniques learn vertical knowledge for the seed site based on the human labels and the extracted features, and adapt the learned vertical knowledge to a new web site to automatically and accurately identify attributes and extract attribute values targeted within a given vertical for structured web data extraction.
Public/Granted literature
- US20130073514A1 FLEXIBLE AND SCALABLE STRUCTURED WEB DATA EXTRACTION Public/Granted day: 2013-03-21