Patent search ap:("INTERNATIONAL BUSINESS MACHINES CORPORATION") AND inv:"Peter Zhong" Page 1

1.

Granted invention patent
Automated generation of structured training data from unstructured documents (Active)

Publication number: US11244203B2

Publication date: 2022-02-08

Application number: US16784726

Application date: 2020-02-07

Applicant: International Business Machines Corporation

Inventor: Peter Zhong, Antonio Jose Jimeno Yepes, Jianbin Tang

IPC: G06K9/62, G06F16/332, G06F16/35, G06F40/205, G06K9/32, G06F16/93

Abstract: Methods, systems and computer program products for automatically generating structured training data based on an unstructured document are provided. Aspects include receiving an unstructured document and a corresponding structured document that includes labeled portions. Aspects also include generating a parsed document that has one or more extracted objects by applying a parsing tool to the unstructured document. Aspects also include identifying one or more matching extracted objects by applying a matching algorithm to the structured document and the parsed document. Each matching extracted object is an extracted object of the parsed document that corresponds to a labeled portion of the structured document. Aspects also include annotating a region of the unstructured document that corresponds to the bounding box of the respective matching extracted object with a respective label of the corresponding labeled portion of the structured document.
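
Illustrative sketch (not taken from the patent): the matching-and-annotation idea in the abstract above can be pictured roughly as follows. The `ExtractedObject` type, the `annotate` helper, the 0.9 threshold, and the use of `difflib.SequenceMatcher` as the matching algorithm are all assumptions made for illustration; the abstract does not say which parsing tool or matching algorithm is used.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class ExtractedObject:
    """A text object emitted by a hypothetical parsing tool."""
    text: str
    bbox: tuple  # (x0, y0, x1, y1) in page coordinates


def similarity(a: str, b: str) -> float:
    """Case-insensitive, whitespace-normalised similarity ratio in [0, 1]."""
    def norm(s: str) -> str:
        return " ".join(s.lower().split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()


def annotate(labeled_portions, extracted_objects, threshold=0.9):
    """Pair each labeled portion with its best-matching extracted object.

    labeled_portions: list of (label, text) pairs from the structured document.
    Returns a list of (bbox, label) annotations for the unstructured document.
    """
    annotations = []
    for label, text in labeled_portions:
        best = max(extracted_objects,
                   key=lambda o: similarity(o.text, text),
                   default=None)
        # Keep the match only if it is close enough (assumed threshold).
        if best is not None and similarity(best.text, text) >= threshold:
            annotations.append((best.bbox, label))
    return annotations


labeled = [("title", "Deep Learning for Tables"),
           ("abstract", "We study table recognition.")]
parsed = [
    ExtractedObject("Deep  learning for tables", (50, 700, 400, 730)),
    ExtractedObject("We study table recognition.", (50, 600, 400, 680)),
    ExtractedObject("Page 1", (280, 20, 320, 30)),
]
result = annotate(labeled, parsed)
```

Here the stray "Page 1" object is left unannotated because no labeled portion resembles it closely enough.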

2.

Invention patent application
PERSONALIZED OPTICS-FREE VISION CORRECTION (Active)

Publication number: US20210049982A1

Publication date: 2021-02-18

Application number: US16541214

Application date: 2019-08-15

Applicant: International Business Machines Corporation

Inventor: Elaheh ShafieiBavani, Peter Zhong, Rahil Garnavi, Michael Raghib

IPC: G09G5/37

Abstract: A user profile associated with a first user is received. A user prescription associated with the first user is received. A historical interaction of the first user with a display is received. A global vision model is received. One or more display settings to be used on the display is determined based on at least the user profile, the user prescription, the global vision model, and the historical interaction.

3.

Granted invention patent
Document access control based on document component layouts (Active)

Publication number: US11734445B2

Publication date: 2023-08-22

Application number: US17109454

Application date: 2020-12-02

Applicant: International Business Machines Corporation

Inventor: Peter Zhong, Antonio Jose Jimeno Yepes, Lenin Mehedy

IPC: G06F21/00, G06F21/62, G06V30/414

CPC classification number: G06F21/6227, G06V30/414

Abstract: In an approach for providing a document access control based on document component layouts, a processor detects a layout of a document, the layout including one or more components of the document. A processor defines an access policy to access the one or more components based on the layout. A processor authorizes a request to access the one or more components based on the access policy and the layout. A processor retrieves the one or more components based on the access policy and the authorized request.
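
Illustrative sketch (not taken from the patent): one simple way to picture component-level access control is a policy keyed by component type. The `Component` type, the `POLICY` table, the role names, and the `retrieve` helper are hypothetical; the abstract does not specify how layouts are detected or how policies are represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    kind: str      # layout component type, e.g. "title", "table", "figure"
    content: str


# Hypothetical policy: which roles may read which component kinds.
POLICY = {
    "title": {"analyst", "guest"},
    "table": {"analyst"},
}


def retrieve(components, role, requested_kinds, policy=POLICY):
    """Return only the requested components that the role is authorised to read."""
    allowed = {k for k in requested_kinds if role in policy.get(k, set())}
    return [c for c in components if c.kind in allowed]


# A detected layout is assumed to be available as a list of components.
layout = [Component("title", "Q3 report"),
          Component("table", "revenue by region")]
guest_view = retrieve(layout, "guest", {"title", "table"})
analyst_view = retrieve(layout, "analyst", {"title", "table"})
```

In this sketch a guest requesting both components receives only the title, while an analyst receives both.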

4.

Granted invention patent
Automatic delineation and extraction of tabular data using machine learning (Active)

Publication number: US11380116B2

Publication date: 2022-07-05

Application number: US16659977

Application date: 2019-10-22

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Peter Zhong, Antonio Jose Jimeno Yepes, Elaheh Shafieibavani

IPC: G06V30/414, G06N3/04, G06N20/00

Abstract: A computer-implemented method for using a machine learning model to automatically extract tabular data from an image includes receiving a set of images of tabular data and a set of markup data corresponding respectively to the images of tabular data. The method further includes training a first neural network to delineate the tabular data into cells using the markup data, and training a second neural network to determine content of the cells in the tabular data using the markup data. The method further includes, upon receiving an input image containing a first tabular data without any markup data, generating an electronic output corresponding to the first tabular data by determining the structure of the first tabular data using the first neural network and extracting content of the first tabular data using the second neural network.

5.

Granted invention patent
Personalized optics-free vision correction (Active)

Publication number: US11222615B2

Publication date: 2022-01-11

Application number: US16541214

Application date: 2019-08-15

Applicant: International Business Machines Corporation

Inventor: Elaheh ShafieiBavani, Peter Zhong, Rahil Garnavi, Michael Raghib

IPC: G09G5/37

Abstract: A user profile associated with a first user is received. A user prescription associated with the first user is received. A historical interaction of the first user with a display is received. A global vision model is received. One or more display settings to be used on the display is determined based on at least the user profile, the user prescription, the global vision model, and the historical interaction.

6.

Granted invention patent
Automatic delineation and extraction of tabular data in portable document format using graph neural networks (Active)

Publication number: US11599711B2

Publication date: 2023-03-07

Application number: US17111392

Application date: 2020-12-03

Applicant: International Business Machines Corporation

Inventor: Peter Zhong, Antonio Jose Jimeno Yepes

IPC: G06F40/157, G06N3/04, G06N3/08, G06V30/413, G06V30/414

Abstract: Aspects of the present invention disclose a method for automatic delineation and extraction of tabular data in portable document format (PDF). The method includes one or more processors extracting metadata corresponding to tabular data in a text-based portable document format (PDF), wherein the metadata is associated with characters and border lines of the tabular data. The method further includes generating a graph structure corresponding to the tabular data in the text-based PDF based at least in part on the metadata. The method further includes generating a vector representation of the graph structure. The method further includes constructing a tree structure corresponding to the tabular data based at least in part on the vector representation.
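
Illustrative sketch (not taken from the patent): the graph-construction step mentioned in the abstract above could look something like the following, which links each text token to its nearest neighbour to the right in the same row and its nearest neighbour below in the same column. The token format `(text, x, y)`, the `build_graph` helper, and the alignment tolerance are assumptions; the graph neural network embedding and the tree construction are not covered here.

```python
from collections import defaultdict


def build_graph(tokens, tol=2.0):
    """Build an undirected adjacency map over table tokens.

    tokens: list of (text, x, y) tuples, with y decreasing down the page
            as in PDF coordinates (an assumed input format).
    Returns a dict mapping token index -> set of neighbouring indices.
    """
    edges = defaultdict(set)
    for i, (_, xi, yi) in enumerate(tokens):
        # Candidates on the same row, strictly to the right of token i.
        same_row = [(xj - xi, j) for j, (_, xj, yj) in enumerate(tokens)
                    if abs(yj - yi) <= tol and xj > xi]
        # Candidates in the same column, strictly below token i.
        same_col = [(yi - yj, j) for j, (_, xj, yj) in enumerate(tokens)
                    if abs(xj - xi) <= tol and yj < yi]
        for candidates in (same_row, same_col):
            if candidates:
                _, j = min(candidates)  # nearest candidate by distance
                edges[i].add(j)
                edges[j].add(i)
    return dict(edges)


tokens = [("Name", 10, 100), ("Age", 60, 100),
          ("Ann", 10, 80), ("31", 60, 80)]
graph = build_graph(tokens)
```

For this 2x2 table, each token ends up connected to its row neighbour and its column neighbour.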

7.

Invention patent application
DOCUMENT ACCESS CONTROL BASED ON DOCUMENT COMPONENT LAYOUTS (Active)

Publication number: US20220171871A1

Publication date: 2022-06-02

Application number: US17109454

Application date: 2020-12-02

Applicant: International Business Machines Corporation

Inventor: Peter Zhong, Antonio Jose Jimeno Yepes, Lenin Mehedy

IPC: G06F21/62, G06K9/00

Abstract: In an approach for providing a document access control based on document component layouts, a processor detects a layout of a document, the layout including one or more components of the document. A processor defines an access policy to access the one or more components based on the layout. A processor authorizes a request to access the one or more components based on the access policy and the layout. A processor retrieves the one or more components based on the access policy and the authorized request.

8.

Invention patent application
MULTI-MODEL, MULTI-TASK TRAINED NEURAL NETWORK FOR ANALYZING UNSTRUCTURED AND SEMI-STRUCTURED ELECTRONIC DOCUMENTS (Active)

Publication number: US20210286989A1

Publication date: 2021-09-16

Application number: US16815391

Application date: 2020-03-11

Applicant: International Business Machines Corporation

Inventor: Peter Zhong, Antonio Jose Jimeno Yepes, Elaheh ShafieiBavani

IPC: G06K9/00, G06F16/93, G06N20/00, G06N5/04, G06F40/205

Abstract: Embodiments of the invention describe a computer-implemented method of analyzing an electronic version of a document. The computer-implemented method can include an architecture of machine learning sub-models that performs the global task of translating unstructured and semi-structured inputs into numerical representations that can be recognized and manipulated by a content-analysis (CA) sub-model without relying on brute force analysis. Embodiments of the invention achieve these results by separating the global task into auxiliary tasks and assigning each sub-model to at least one of the auxiliary tasks. The auxiliary tasks can include parsing the unstructured or semi-structured inputs into format types (e.g., lists, tables, figures, text, etc. of a PDF document), extracting features of the parsed document, and performing a computer-based CA on the extracted features. The sub-models are trained in stages and in groups, wherein both the stages and the groupings are based on the complexity of the sub-model's assigned task.

9.

Invention patent application
AUTOMATIC DELINEATION AND EXTRACTION OF TABULAR DATA IN PORTABLE DOCUMENT FORMAT USING GRAPH NEURAL NETWORKS (Active)

Publication number: US20220180044A1

Publication date: 2022-06-09

Application number: US17111392

Application date: 2020-12-03

Applicant: International Business Machines Corporation

Inventor: Peter Zhong, Antonio Jose Jimeno Yepes

IPC: G06F40/157, G06K9/00, G06N3/08, G06N3/04

Abstract: Aspects of the present invention disclose a method for automatic delineation and extraction of tabular data in portable document format (PDF). The method includes one or more processors extracting metadata corresponding to tabular data in a text-based portable document format (PDF), wherein the metadata is associated with characters and border lines of the tabular data. The method further includes generating a graph structure corresponding to the tabular data in the text-based PDF based at least in part on the metadata. The method further includes generating a vector representation of the graph structure. The method further includes constructing a tree structure corresponding to the tabular data based at least in part on the vector representation.

10.

Invention patent application
AUTOMATED GENERATION OF STRUCTURED TRAINING DATA FROM UNSTRUCTURED DOCUMENTS (Active)

Publication number: US20210248420A1

Publication date: 2021-08-12

Application number: US16784726

Application date: 2020-02-07

Applicant: International Business Machines Corporation

Inventor: Peter Zhong, Antonio Jose Jimeno Yepes, Jianbin Tang

IPC: G06K9/62, G06F16/332, G06F16/35, G06F40/205, G06F16/93, G06K9/32

Abstract: Methods, systems and computer program products for automatically generating structured training data based on an unstructured document are provided. Aspects include receiving an unstructured document and a corresponding structured document that includes labeled portions. Aspects also include generating a parsed document that has one or more extracted objects by applying a parsing tool to the unstructured document. Aspects also include identifying one or more matching extracted objects by applying a matching algorithm to the structured document and the parsed document. Each matching extracted object is an extracted object of the parsed document that corresponds to a labeled portion of the structured document. Aspects also include annotating a region of the unstructured document that corresponds to the bounding box of the respective matching extracted object with a respective label of the corresponding labeled portion of the structured document.
