Invention Grant
- Patent Title: Transferable neural architecture for structured data extraction from web documents
- Application No.: US17792788
- Application Date: 2020-01-29
- Publication No.: US11886533B2
- Publication Date: 2024-01-30
- Inventor: Ying Sheng, Yuchen Lin, Sandeep Tata, Nguyen Vo
- Applicant: Google LLC
- Applicant Address: US CA Mountain View
- Assignee: Google LLC
- Current Assignee: Google LLC
- Current Assignee Address: US CA Mountain View
- Agency: Botos Churchill IP Law
- International Application: PCT/US2020/015602 2020.01.29
- International Announcement: WO2021/154238A 2021.08.05
- Date entered country: 2022-07-14
- Main IPC: G06F16/958
- IPC: G06F16/958; G06F16/957; G06F40/14

Abstract:
Systems and methods for efficiently identifying and extracting machine-actionable structured data from web documents are provided. The technology employs neural network architectures which process the raw HTML content of a set of seed websites to create transferable models regarding information of interest. These models can then be applied to the raw HTML of other websites to identify similar information of interest. Data can thus be extracted across multiple websites in a functional, structured form that allows it to be used further by a processing system.
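The abstract describes models that operate on raw HTML but does not specify how the HTML is prepared. As a minimal sketch of one plausible preprocessing step only (not the patented architecture), the following turns raw HTML into a list of text nodes paired with their tag paths, which a downstream model could then classify. The names `TextNodeCollector` and `extract_text_nodes` are hypothetical and chosen for illustration.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
# Elements whose text content is not visible page text.
SKIP_TAGS = {"script", "style"}


class TextNodeCollector(HTMLParser):
    """Collects (tag_path, text) pairs for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, outermost first
        self.nodes = []   # collected (path, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.stack and self.stack[-1] in SKIP_TAGS:
            return
        self.nodes.append(("/" + "/".join(self.stack), text))


def extract_text_nodes(raw_html):
    """Return candidate text nodes from raw HTML as (tag_path, text) pairs."""
    parser = TextNodeCollector()
    parser.feed(raw_html)
    parser.close()
    return parser.nodes


if __name__ == "__main__":
    page = ("<html><body><h1>Acme Phone</h1>"
            "<div><span>Price:</span> <span>$199</span></div></body></html>")
    for path, text in extract_text_nodes(page):
        print(path, "|", text)
```

Each pair keeps the text together with its position in the markup, so the same representation can be produced for both seed websites and previously unseen websites. Learning which nodes hold information of interest is the part the neural models would handle and is deliberately left out of this sketch.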
Public/Granted literature
- US20230014465A1, A Transferable Neural Architecture for Structured Data Extraction From Web Documents, Public/Granted day: 2023-01-19