Invention Grant
US08856129B2 Flexible and scalable structured web data extraction (Active)

Abstract:
This document describes techniques that label text nodes of a seed site for each of a plurality of verticals. Once a seed site is labeled for a given vertical, the techniques extract features from the labeled text nodes of the seed site. The techniques learn vertical knowledge for the seed site based on the human labels and the extracted features, and adapt the learned vertical knowledge to a new web site to automatically and accurately identify attributes and extract attribute values targeted within a given vertical for structured web data extraction.
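
The pipeline in the abstract (label a seed site, extract features, learn vertical knowledge, adapt it to a new site) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the features, the learning method, or any function names, so the content and context features, the frequency-profile "knowledge", the scoring rule, the threshold, and the sample book-vertical data are all hypothetical stand-ins rather than the patented method.

```python
from collections import Counter, defaultdict


def extract_features(text, preceding):
    """Return illustrative content and context features for one text node."""
    feats = set()
    # Content features: coarse shape of the node's own text (assumed choices).
    feats.add("has_digit" if any(c.isdigit() for c in text) else "no_digit")
    feats.add("has_currency" if any(c in "$€£" for c in text) else "no_currency")
    feats.add(f"tokens={min(len(text.split()), 5)}")
    # Context features: words of the text that precedes the node on the page.
    for tok in preceding.lower().replace(":", " ").split():
        feats.add(f"prev={tok}")
    return feats


def learn_vertical_knowledge(labeled_nodes):
    """Build a per-attribute feature-frequency profile from labeled seed nodes.

    labeled_nodes: iterable of (text, preceding_text, attribute) tuples.
    """
    counts = defaultdict(Counter)
    totals = Counter()
    for text, preceding, attr in labeled_nodes:
        totals[attr] += 1
        counts[attr].update(extract_features(text, preceding))
    return {
        attr: {f: n / totals[attr] for f, n in c.items()}
        for attr, c in counts.items()
    }


def extract_from_new_site(knowledge, nodes, threshold=0.5):
    """Pick, for each attribute, the best-scoring text node of an unseen site.

    nodes: iterable of (text, preceding_text) tuples without labels.
    """
    best = {}
    for text, preceding in nodes:
        feats = extract_features(text, preceding)
        for attr, profile in knowledge.items():
            # Score = average learned frequency of the node's features.
            score = sum(profile.get(f, 0.0) for f in feats) / len(feats)
            if score >= threshold and score > best.get(attr, ("", 0.0))[1]:
                best[attr] = (text, score)
    return {attr: text for attr, (text, _) in best.items()}


# Hypothetical seed site for a "book" vertical, labeled by hand.
seed = [
    ("The Great Gatsby", "Title:", "title"),
    ("F. Scott Fitzgerald", "Author:", "author"),
    ("$10.99", "Price:", "price"),
]
# Hypothetical unlabeled text nodes from a different site in the same vertical.
new_site = [
    ("Moby Dick", "Title"),
    ("Herman Melville", "By author"),
    ("$8.50", "Our price"),
    ("Free shipping on orders over $25", ""),
]

knowledge = learn_vertical_knowledge(seed)
result = extract_from_new_site(knowledge, new_site)
```

The sketch keeps the learned knowledge independent of any one site's markup, which is the property that lets it transfer: the new site uses different label wording ("By author", "Our price"), and the profile still matches on the shared content and context features.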