Patent search ap:("Google Inc.") AND inv:"Jinan Lou" Page 1

1.

Patent application
CLUSTERING COMMUNICATIONS BASED ON CLASSIFICATION (status: in force)

Publication (announcement) number: US20160314182A1

Publication (announcement) date: 2016-10-27

Application number: US14414855

Application date: 2014-09-18

Applicant: Google, Inc.

Inventor: Xincheng Zhang, Hui Tan, Zhiyu Wang, Jinan Lou

IPC: G06F17/30, H04L12/26

CPC classification number: G06F17/30598, G06F17/30705, H04L43/04

Abstract: Methods and apparatus related to clustering documents based on one or more classification terms and optionally based on similarity of structural paths of the documents. In some implementations, the documents are communications such as structured emails or other structured communications. In some of those implementations, clustering the communications includes identifying a plurality of classification terms indicative of a classification, identifying a corpus of communications that includes communications that are not labeled with an association to the classification, and determining a cluster of the communications based on occurrence of one or more of the classification terms in the communications of the cluster.
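
Illustrative sketch (not taken from the patent): the abstract describes grouping unlabeled communications by the occurrence of classification terms. The Python below shows one simple way that idea could look. The function name `cluster_by_terms`, the `min_hits` threshold, the word tokenization, and the sample corpus are all assumptions made for illustration. The optional structural-path similarity step is not modeled.

```python
import re
from typing import Dict, List, Set


def cluster_by_terms(corpus: Dict[str, str], terms: Set[str], min_hits: int = 1) -> List[str]:
    """Return ids of communications that contain at least `min_hits`
    distinct classification terms (hypothetical criterion)."""
    cluster = []
    for comm_id, text in corpus.items():
        # Lowercase and split into alphabetic words so punctuation is ignored.
        words = set(re.findall(r"[a-z]+", text.lower()))
        if len(words & terms) >= min_hits:
            cluster.append(comm_id)
    return cluster


# Unlabeled sample corpus (made up for this example).
corpus = {
    "a": "Your flight itinerary and boarding pass",
    "b": "Weekly newsletter from the team",
    "c": "Flight delayed: new departure gate",
}
# Terms assumed to indicate a "flight" classification.
flight_terms = {"flight", "itinerary", "boarding", "departure"}

flight_cluster = cluster_by_terms(corpus, flight_terms)
```

Raising `min_hits` makes the cluster stricter, since a communication then needs several distinct terms to qualify.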

2.

Patent application
GENERATING DATA RECORDS BASED ON PARSING (status: under examination, published)

Publication (announcement) number: US20140279864A1

Publication (announcement) date: 2014-09-18

Application number: US14143835

Application date: 2013-12-30

Applicant: Google Inc.

Inventor: Mikhail Lopyrev, Gaurav Jain, Bote Deepak Narayan, Vitaly Repeshko, Chengling Chan, Jinan Lou

IPC: G06F17/30

CPC classification number: G06F17/2705, G06F16/258

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving a first document, the first document being associated with a user, executing a plurality of parsers, each parser of the plurality of parsers processing the first document to provide one or more first data values, merging the one or more first data values provided from the plurality of parsers to populate a data record having one or more data fields, the data record being specific to the user, and storing the data record in computer-readable memory.
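
Illustrative sketch (not taken from the patent): the abstract describes running several parsers over one document and merging their values into a user-specific data record. The Python below is a minimal version of that flow. The two regex parsers, the field names, the "first parser wins" merge rule, and the in-memory `store` dict are assumptions for illustration only.

```python
import re
from typing import Callable, Dict, List

# A parser takes the document text and returns zero or more field values.
Parser = Callable[[str], Dict[str, str]]


def parse_order_number(doc: str) -> Dict[str, str]:
    m = re.search(r"Order #(\d+)", doc)
    return {"order_number": m.group(1)} if m else {}


def parse_total(doc: str) -> Dict[str, str]:
    m = re.search(r"Total: \$(\d+\.\d{2})", doc)
    return {"total": m.group(1)} if m else {}


def build_record(user_id: str, doc: str, parsers: List[Parser]) -> Dict[str, str]:
    """Run every parser on the document and merge the values into one record."""
    record = {"user_id": user_id}
    for parser in parsers:
        for field, value in parser(doc).items():
            # Assumed merge rule: the first parser to supply a field wins.
            record.setdefault(field, value)
    return record


document = "Order #12345 shipped. Total: $19.99"
record = build_record("user-1", document, [parse_order_number, parse_total])

# Store the record in memory, keyed by user (stand-in for real storage).
store = {record["user_id"]: record}
```

A different merge rule, such as preferring the parser with the highest confidence, would only change the body of the inner loop.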

3.

Granted patent
Template-based structured document classification and extraction (status: in force)

Publication (announcement) number: US10657158B2

Publication (announcement) date: 2020-05-19

Application number: US15360939

Application date: 2016-11-23

Applicant: Google Inc.

Inventor: Ying Sheng, Yifeng Lu, Jing Xie, Jie Yang, Luis Garcia Pueyo, Jinan Lou, James Wendt

IPC: G06F16/00, G06F16/28, G06N20/00, G06F16/93, G06Q10/10, G06N20/20, G06F40/174, G06F40/186

Abstract: Techniques are described herein for automatically generating data extraction templates for structured documents (e.g., B2C emails, invoices, bills, invitations, etc.), and for assigning classifications to those data extraction templates to streamline data extraction from subsequent structured documents. In various implementations, a data extraction template generated from a cluster of structured documents that share fixed content may be identified. Features of the cluster of structured documents may be applied as input to extraction machine learning model(s) trained to provide location(s) of transient field(s) in structured documents, to determine location(s) of transient field(s) in the cluster of structured documents. An association between the data extraction template and the determined transient field location(s) may be stored. Based on the association, data point(s) may be extracted from a given structured document of a user that shares fixed content with the cluster of structured documents. The extracted data point(s) may be surfaced to the user.

4.

Patent application
IDENTIFYING AN ASSUMPTION ABOUT A USER, AND DETERMINING A VERACITY OF THE ASSUMPTION (status: under examination, published)

Publication (announcement) number: US20170140022A1

Publication (announcement) date: 2017-05-18

Application number: US14289355

Application date: 2014-05-28

Applicant: Google Inc.

Inventor: Jinan Lou, Hongtao Zhong

IPC: G06F17/30, G06N99/00

CPC classification number: G06F16/284, G06F16/337, G06F16/955, G06N5/025, G06N20/00

Abstract: Methods, apparatus and computer-readable media (transitory and non-transitory) are disclosed for analyzing a document associated with a user to identify an assumption about the user, comparing the assumption with one or more signals that are associated with the user and separate from the document to determine a veracity of the assumption, and updating one or more techniques for identifying an assumption based on feedback that is generated based on the veracity.

5.

Granted patent
Query-based stream (status: in force)

Publication (announcement) number: US09600543B1

Publication (announcement) date: 2017-03-21

Application number: US14040466

Application date: 2013-09-27

Applicant: Google Inc.

Inventor: Lucian Florin Cionca, Andre Rohe, Yonatan Zunger, Sangsoo Sung, Mohit Oberoi, Daniel Belov, Harish Rajamani, Jinan Lou

IPC: G06F17/30, G06Q50/10

CPC classification number: G06F17/30554, G06F17/30867, G06Q50/10

Abstract: In one aspect, a method includes receiving an indication of a request from a user to view a stream associated with the user, generating a request for one or more items visible to the user for display within the stream, the request including a search query identifying search criteria including one or more tokens, the one or more tokens including at least a user token identifying the user, receiving one or more items in response to the request, the one or more items including at least one of the one or more tokens and further being visible to the user, and providing the one or more items for display to the user within the stream in response to the request. Other aspects can be embodied in corresponding systems and apparatus, including computer program products.
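
Illustrative sketch (not taken from the patent): the abstract describes building a stream from a search query whose tokens include a user token, and returning only items that carry at least one query token and are visible to the user. The Python below models that filter in memory. The `Item` class, the `user:<id>` token format, and the sample items are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Item:
    item_id: str
    tokens: FrozenSet[str]      # tokens indexed for this item
    visible_to: FrozenSet[str]  # ids of users allowed to see the item


def build_query(user_id: str, extra_tokens: Iterable[str] = ()) -> FrozenSet[str]:
    """Build the query tokens; a user token is always included."""
    return frozenset({f"user:{user_id}", *extra_tokens})


def fetch_stream(items: List[Item], user_id: str, query_tokens: FrozenSet[str]) -> List[Item]:
    """Keep items that share a token with the query and are visible to the user."""
    return [
        item for item in items
        if (item.tokens & query_tokens) and user_id in item.visible_to
    ]


items = [
    Item("p1", frozenset({"user:alice"}), frozenset({"alice", "bob"})),
    Item("p2", frozenset({"user:alice"}), frozenset({"bob"})),    # not visible to alice
    Item("p3", frozenset({"topic:cats"}), frozenset({"alice"})),  # no matching token
]

stream = fetch_stream(items, "alice", build_query("alice"))
```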

6.

Granted patent
Generating and applying event data extraction templates (status: in force)

Publication (announcement) number: US10360537B1

Publication (announcement) date: 2019-07-23

Application number: US15484933

Application date: 2017-04-11

Applicant: Google Inc.

Inventor: Mike Bendersky, Maureen Heymans, Jinan Lou, Jie Yang, MyLinh Yang, Amitabh Saikia, Marc-Allen Cartright, Vanja Josifovski, Hui Tan, Luis Garcia Pueyo

IPC: G06F17/30, G06Q10/10, G06F16/248, G06F16/9535, H04W4/029

Abstract: Techniques are described herein for generating and applying event data extraction templates. In various implementations, a data extraction template may be applied to structured communications to extract, from each structured communication, event data associated with a transient markup language path indicated in the data extraction template. The data extraction template may include an event-related semantic data type assigned to the transient markup language path and a strength of association between the transient structural path and the event-related semantic data type. Feedback may be obtained concerning event data extracted from one or more of the structured communications. Based on the feedback, the strength of association between the transient markup language path and the event-related semantic data type may be altered. The data extraction template may then be applied to a subsequent structured communication to extract new event data from the structured communication based on the altered strength of association.
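
Illustrative sketch (not taken from the patent): the abstract describes a template that binds a transient path to an event-related data type with a strength of association, and that alters the strength based on feedback. The Python below shows one possible shape for that loop. The `PathBinding` class, the fixed step size, the 0.5 extraction threshold, and the representation of a communication as a path-to-text dict are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class PathBinding:
    path: str         # transient markup language path
    data_type: str    # event-related semantic data type
    strength: float   # strength of association, kept in [0, 1]


def apply_feedback(binding: PathBinding, correct: bool, step: float = 0.1) -> None:
    """Raise the strength on positive feedback, lower it on negative feedback."""
    delta = step if correct else -step
    binding.strength = min(1.0, max(0.0, binding.strength + delta))


def extract(binding: PathBinding, communication: Dict[str, str],
            threshold: float = 0.5) -> Optional[Tuple[str, str]]:
    """Return (data_type, text) only if the association is strong enough."""
    if binding.strength < threshold:
        return None
    text = communication.get(binding.path)
    return (binding.data_type, text) if text is not None else None


binding = PathBinding("/html/body/table/tr[2]/td", "event_date", strength=0.6)
message = {"/html/body/table/tr[2]/td": "2017-04-11"}

before = extract(binding, message)       # strength 0.6 is above the threshold
apply_feedback(binding, correct=False)   # users report a wrong extraction
apply_feedback(binding, correct=False)
after = extract(binding, message)        # strength is now below the threshold
```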

7.

Granted patent
Clustering communications based on classification (status: in force)

Publication (announcement) number: US10007717B2

Publication (announcement) date: 2018-06-26

Application number: US14414855

Application date: 2014-09-18

Applicant: Google Inc.

Inventor: Xincheng Zhang, Hui Tan, Zhiyu Wang, Jinan Lou

IPC: G06F17/30, H04L12/26

CPC classification number: G06F16/285, G06F16/35, H04L43/04

Abstract: Methods and apparatus related to clustering documents based on one or more classification terms and optionally based on similarity of structural paths of the documents. In some implementations, the documents are communications such as structured emails or other structured communications. In some of those implementations, clustering the communications includes identifying a plurality of classification terms indicative of a classification, identifying a corpus of communications that includes communications that are not labeled with an association to the classification, and determining a cluster of the communications based on occurrence of one or more of the classification terms in the communications of the cluster.

8.

Patent application
TEMPLATE-BASED STRUCTURED DOCUMENT CLASSIFICATION AND EXTRACTION (status: under examination, published)

Publication (announcement) number: US20180144042A1

Publication (announcement) date: 2018-05-24

Application number: US15360939

Application date: 2016-11-23

Applicant: Google Inc.

Inventor: Ying Sheng, Yifeng Lu, Jing Xie, Jie Yang, Luis Garcia Pueyo, Jinan Lou, James Wendt

IPC: G06F17/30, G06F17/24, G06N99/00

CPC classification number: G06F16/285, G06F16/93, G06F17/243, G06F17/248, G06N20/00, G06N20/20, G06Q10/10

Abstract: Techniques are described herein for automatically generating data extraction templates for structured documents (e.g., B2C emails, invoices, bills, invitations, etc.), and for assigning classifications to those data extraction templates to streamline data extraction from subsequent structured documents. In various implementations, a data extraction template generated from a cluster of structured documents that share fixed content may be identified. Features of the cluster of structured documents may be applied as input to extraction machine learning model(s) trained to provide location(s) of transient field(s) in structured documents, to determine location(s) of transient field(s) in the cluster of structured documents. An association between the data extraction template and the determined transient field location(s) may be stored. Based on the association, data point(s) may be extracted from a given structured document of a user that shares fixed content with the cluster of structured documents. The extracted data point(s) may be surfaced to the user.

9.

Granted patent
Generating and applying event data extraction templates (status: in force)

Publication (announcement) number: US09652530B1

Publication (announcement) date: 2017-05-16

Application number: US14470416

Application date: 2014-08-27

Applicant: Google Inc.

Inventor: Mike Bendersky, Maureen Heymans, Jinan Lou, Jie Yang, MyLinh Yang, Amitabh Saikia, Marc-Allen Cartright, Vanja Josifovski, Hui Tan, Luis Garcia Pueyo

IPC: G06F17/30

CPC classification number: G06F17/30705, G06F17/30923

Abstract: Methods and apparatus are described herein for generating and applying event data extraction templates. In various implementations, a set of structural paths may be identified from a corpus of communications. A first structural path of the set of structural paths, associated with a first segment of text, may be classified as transient in response to a determination that a frequency of occurrences of the first segment of text across the corpus satisfies a criterion. Event heuristics may be applied to the communications of the corpus. A determination may be made, based on the applying, that the communications of the corpus are event-related. An event data type may be assigned to the transient structural path based on the applying. An event data extraction template may be generated to extract, from one or more subsequent communications, one or more event-related segments of text associated with the transient structural path.
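
Illustrative sketch (not taken from the patent): the abstract describes classifying a structural path as transient when the frequency of its text across the corpus satisfies a criterion. The Python below covers only that step. The criterion used here (the most common text at a path appears in fewer than half of the communications), the `classify_paths` name, and the sample corpus are assumptions for illustration. Event heuristics and template generation are not modeled.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def classify_paths(corpus: List[Dict[str, str]],
                   max_fixed_fraction: float = 0.5) -> Dict[str, str]:
    """Label each structural path as 'transient' or 'fixed'.

    Each communication is modeled as a dict mapping a structural path
    (for example an XPath) to the text segment found at that path.
    """
    texts_by_path = defaultdict(Counter)
    for communication in corpus:
        for path, text in communication.items():
            texts_by_path[path][text] += 1

    labels = {}
    for path, counts in texts_by_path.items():
        # Fraction of the corpus that shares the most common text at this path.
        top_fraction = counts.most_common(1)[0][1] / len(corpus)
        labels[path] = "transient" if top_fraction < max_fixed_fraction else "fixed"
    return labels


corpus = [
    {"/html/body/h1": "Your reservation", "/html/body/p[1]": "Dinner on May 3"},
    {"/html/body/h1": "Your reservation", "/html/body/p[1]": "Dinner on June 9"},
    {"/html/body/h1": "Your reservation", "/html/body/p[1]": "Lunch on July 1"},
]

labels = classify_paths(corpus)
```

In this sample the heading text repeats in every communication, so its path is treated as fixed boilerplate, while the paragraph text varies and its path is treated as transient.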
