-
1.
Publication No.: US20200226325A1
Publication date: 2020-07-16
Application No.: US16740066
Filing date: 2020-01-10
Applicant: Chevron U.S.A. Inc., CALIFORNIA INSTITUTE OF TECHNOLOGY
Inventor: Asitang Mishra, Shuxing Cheng, Annie Didier, Chris Mattmann, Hamsa Shwetha Venkataram, Grant Lee, Wayne Moses Burke, Vishal Lall
IPC: G06F40/295, G06N20/00, G06F40/284, G06F40/205
Abstract: A computer-implemented, machine learning-based method of converting an unstructured technical report into a structured technical report includes obtaining an unstructured technical report, tokenizing the unstructured technical report into an n-gram array, identifying and filtering non-interesting n-grams from the first n-gram array based on common language usage of the non-interesting n-grams and a determination that the non-interesting n-grams do not appear on a confirmed technical entity database, generating and displaying a technical entity candidate list from the filtered n-gram array, displaying, obtaining, from a pattern matching model and/or a graphical user interface, an indication that a technical entity candidate is a technical entity of interest, appending the technical entity of interest to the confirmed technical entity database, generating and displaying a structured technical report with the confirmed technical entities and corresponding technical entity value parameters, and iterating the process to refine the pattern matching model.
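The tokenizing and filtering steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `COMMON_WORDS`, `tokenize_ngrams`, and `filter_candidates` are hypothetical, the tiny word list stands in for whatever measure of "common language usage" the method actually uses, and treating an n-gram as non-interesting when all of its words are common is an assumption made only for illustration.

```python
import re

# Hypothetical stand-in for a "common language usage" resource.
COMMON_WORDS = {"the", "a", "of", "was", "at", "and", "to", "in", "is"}


def tokenize_ngrams(text, max_n=2):
    """Split text into lowercase words and return all n-grams up to max_n."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    ngrams = []
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            ngrams.append(" ".join(words[i:i + n]))
    return ngrams


def filter_candidates(ngrams, confirmed):
    """Drop n-grams that are common language AND absent from the confirmed set.

    Returns the remaining n-grams, de-duplicated, in first-seen order.
    """
    candidates = []
    for gram in ngrams:
        is_common = all(word in COMMON_WORDS for word in gram.split())
        if is_common and gram not in confirmed:
            continue  # non-interesting: common usage and not a known entity
        if gram not in candidates:
            candidates.append(gram)
    return candidates


if __name__ == "__main__":
    report = "The porosity of the sandstone was measured"
    print(filter_candidates(tokenize_ngrams(report), confirmed=set()))
```

Note that an n-gram already present in the confirmed set survives the filter even if it consists of common words, which mirrors the two-part condition in the abstract.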
-
2.
Publication No.: US11790170B2
Publication date: 2023-10-17
Application No.: US16740066
Filing date: 2020-01-10
Applicant: Chevron U.S.A. Inc., CALIFORNIA INSTITUTE OF TECHNOLOGY
Inventor: Asitang Mishra, Shuxing Cheng, Annie Didier, Chris Mattmann, Hamsa Shwetha Venkataram, Grant Lee, Wayne Moses Burke, Vishal Lall
IPC: G06F40/295, G06N20/00, G06F40/205, G06F40/284, G06F40/30
CPC classification number: G06F40/295, G06F40/205, G06F40/284, G06N20/00, G06F40/30
Abstract: A computer-implemented, machine learning-based method of converting an unstructured technical report into a structured technical report includes obtaining an unstructured technical report, tokenizing the unstructured technical report into an n-gram array, identifying and filtering non-interesting n-grams from the first n-gram array based on common language usage of the non-interesting n-grams and a determination that the non-interesting n-grams do not appear on a confirmed technical entity database, generating and displaying a technical entity candidate list from the filtered n-gram array, displaying, obtaining, from a pattern matching model and/or a graphical user interface, an indication that a technical entity candidate is a technical entity of interest, appending the technical entity of interest to the confirmed technical entity database, generating and displaying a structured technical report with the confirmed technical entities and corresponding technical entity value parameters, and iterating the process to refine the pattern matching model.
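The later steps of the abstract (confirming candidates, appending them to the confirmed entity database, and emitting a structured report of entities with their value parameters) might look roughly like the following Python sketch. Everything here is an assumption for illustration: `confirm` and `structured_report` are hypothetical names, the `accept` callable stands in for the pattern matching model or a user's choice in a graphical interface, and the regular expression that picks up the first number within a few characters after an entity is a simplification, not the method claimed in the patent.

```python
import re


def confirm(candidates, confirmed, accept):
    """Add every candidate that `accept` flags as of interest to `confirmed`.

    `accept` is a hypothetical stand-in for the pattern matching model
    and/or a user's selection in a graphical interface.
    """
    for candidate in candidates:
        if accept(candidate):
            confirmed.add(candidate)
    return confirmed


def structured_report(text, confirmed):
    """Return one row per (entity, value, unit) occurrence found in text."""
    rows = []
    for entity in sorted(confirmed):
        # Entity, then up to 20 non-digit characters, then a number and
        # an optional unit ("%" or a run of letters).
        pattern = re.compile(
            re.escape(entity) + r"\D{0,20}?(\d+(?:\.\d+)?)\s*(%|[a-zA-Z]+)?",
            re.IGNORECASE,
        )
        for match in pattern.finditer(text):
            rows.append({
                "entity": entity,
                "value": float(match.group(1)),
                "unit": match.group(2),
            })
    return rows


if __name__ == "__main__":
    text = "Porosity was 12.5 % and permeability was 150 mD."
    entities = confirm(
        ["porosity", "permeability", "was"],
        set(),
        accept=lambda c: c.endswith("ity"),  # toy acceptance rule
    )
    for row in structured_report(text, entities):
        print(row)
```

In an iterative setting, the confirmed set returned by `confirm` would be fed back into the next pass over new reports, which is one plausible reading of "iterating the process to refine the pattern matching model".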
-