Invention Grant
US08682881B1 System and method for extracting structured data from classified websites (Active)

System and method for extracting structured data from classified websites
Abstract:
Systems, methods, and computer-readable storage mediums are provided for automatically extracting data from a classified website. A website is determined to be a classified website based on a set of heuristics. Page models for other classified websites are then accessed. The page models may include listing page models, detail page models, and/or city page models. A listing page in the classified website is identified based on its similarity to the page models for the other classified websites, and a listing page model is then created for that listing page. After the model has been created, data is extracted from the classified website based at least in part on the listing page model. Similar processes are performed for determining a details page, creating a details page model, and extracting data from the classified website using the details page model.
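
The listing-page identification step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how page similarity is computed, so everything below is an assumption made for illustration: a "page model" is approximated as a count of root-to-node HTML tag paths, similarity is cosine similarity between those counts, and the names `TagPathCounter`, `fingerprint`, `cosine`, and `find_listing_page` are hypothetical rather than taken from the patent.

```python
from collections import Counter
from html.parser import HTMLParser
from math import sqrt


class TagPathCounter(HTMLParser):
    """Counts root-to-node tag paths as a crude structural fingerprint.

    Assumption: this stands in for a "page model"; the patent abstract
    does not define the model's contents.
    """

    VOID = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = Counter()

    def handle_starttag(self, tag, attrs):
        self.paths["/".join(self.stack + [tag])] += 1
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def fingerprint(html):
    """Return a Counter mapping each tag path to its number of occurrences."""
    parser = TagPathCounter()
    parser.feed(html)
    parser.close()
    return parser.paths


def cosine(a, b):
    """Cosine similarity between two tag-path Counters (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def find_listing_page(candidates, known_models):
    """Return the URL whose page is most similar to any known listing page model."""
    def best_score(url):
        fp = fingerprint(candidates[url])
        return max(cosine(fp, model) for model in known_models)
    return max(candidates, key=best_score)


# Hypothetical data: one listing page model from another classified site,
# and two candidate pages from the site being analyzed.
known = [fingerprint(
    "<html><body><ul><li><a>Bike</a></li><li><a>Sofa</a></li></ul></body></html>"
)]
pages = {
    "/about": "<html><body><p>About us</p></body></html>",
    "/for-sale": (
        "<html><body><ul><li><a>Car</a></li><li><a>Desk</a></li>"
        "<li><a>Lamp</a></li></ul></body></html>"
    ),
}
print(find_listing_page(pages, known))  # prints /for-sale
```

In this sketch the repeated `ul/li/a` structure dominates the fingerprint, so the page with many similarly structured entries scores highest against the known listing page model. A details page could be found the same way by comparing against detail page models instead.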