Publication number: US09672251B1
Publication date: 2017-06-06
Application number: US14499615
Application date: 2014-09-29
Applicant: Google Inc.
Inventor: Steven Euijong Whang, Rahul Gupta, Alon Yitzchak Halevy, Mohamed Yahya
IPC: G06F17/30
CPC classification number: G06F17/30528, G06F17/30616, G06N5/00
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for extracting facts from a collection of documents. One of the methods includes obtaining a plurality of seed facts; generating a plurality of patterns from the seed facts, wherein each of the plurality of patterns is a dependency pattern generated from a dependency parse; applying the patterns to documents in a collection of documents to extract a plurality of candidate additional facts from the collection of documents; and selecting one or more additional facts from the plurality of candidate additional facts.
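The four steps named in the abstract (obtain seed facts, generate dependency patterns, apply them to documents, select additional facts) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the `Token` structure, the function names, the hand-written toy parses (a real system would obtain them from a dependency parser), the simplified pattern shape (a head word plus the dependency labels of two of its direct dependents), and the support-count selection rule.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    head: int  # index of the head token in the sentence, -1 for the root
    dep: str   # dependency label relative to the head


def generate_patterns(seed_facts, parsed_sentences):
    """Derive (head_word, subj_dep, obj_dep) patterns from sentences that mention a seed fact."""
    patterns = set()
    for tokens in parsed_sentences:
        index = {t.text: i for i, t in enumerate(tokens)}
        for subj, obj in seed_facts:
            if subj in index and obj in index:
                s, o = tokens[index[subj]], tokens[index[obj]]
                # Simplification: only keep cases where both arguments share one head.
                if s.head == o.head and s.head >= 0:
                    patterns.add((tokens[s.head].text, s.dep, o.dep))
    return patterns


def apply_patterns(patterns, parsed_sentences):
    """Match every pattern against every sentence and count the extracted candidate pairs."""
    candidates = Counter()
    for tokens in parsed_sentences:
        for head_text, subj_dep, obj_dep in patterns:
            for h, head in enumerate(tokens):
                if head.text != head_text:
                    continue
                subjs = [t.text for t in tokens if t.head == h and t.dep == subj_dep]
                objs = [t.text for t in tokens if t.head == h and t.dep == obj_dep]
                for s in subjs:
                    for o in objs:
                        candidates[(s, o)] += 1
    return candidates


def select_facts(candidates, seed_facts, min_support=2):
    """Keep candidates seen at least min_support times that are not already seeds."""
    return sorted(f for f, n in candidates.items()
                  if n >= min_support and f not in seed_facts)


def sent(subj, verb, obj):
    """Build a toy parse of a 'subj verb obj' sentence (hypothetical helper)."""
    return [Token(subj, 1, "nsubj"), Token(verb, -1, "ROOT"), Token(obj, 1, "dobj")]


seeds = {("Google", "YouTube")}
corpus = [
    sent("Google", "acquired", "YouTube"),
    sent("Facebook", "acquired", "Instagram"),
    sent("Facebook", "acquired", "Instagram"),
    sent("Microsoft", "acquired", "LinkedIn"),
    sent("Alice", "visited", "Paris"),
]

patterns = generate_patterns(seeds, corpus)
candidates = apply_patterns(patterns, corpus)
new_facts = select_facts(candidates, seeds)
print(patterns)
print(new_facts)
```

In this toy run the single seed yields one pattern built around the head word "acquired". Applying it produces three candidate pairs, and the support threshold keeps only the pair that occurs twice. The threshold is a stand-in for whatever selection criterion an actual implementation would use.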