- Patent title: Text extraction and processing
- Application number: US16287323
- Filing date: 2019-02-27
- Publication (announcement) number: US10963490B2
- Publication (announcement) date: 2021-03-30
- Inventors: Tohru Hasegawa, Hiroaki Uetsuki, Shunsuke Ishikawa, Issei Yoshida, Asako Ono, Yasuyuki Tominaga, Kenta Watanabe, Hiroaki Kikuchi
- Applicant: International Business Machines Corporation
- Applicant address: US NY Armonk
- Patentee: International Business Machines Corporation
- Current patentee: International Business Machines Corporation
- Current patentee address: US NY Armonk
- Agency: Lieberman & Brandsdorfer, LLC
- Main classification: G06F7/00
- IPC classification: G06F7/00; G06F16/31; G06N5/02; G06F16/93
Abstract:
A system, computer program product, and method are provided to selectively index one or more subsets of documents or files. As data is extracted from a document or file, extracted text is organized into data portions and subject to evaluations. Meta characteristic data is leveraged to assess the extracted text. One or more subsets of the organized data portions are selectively identified and subject to enrichment processing, which creates and returns enriched and indexed subsets of the documents or files.
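The flow described in the abstract (organize extracted text into portions, evaluate them using meta characteristic data, select a subset, then enrich and index it) can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `Portion`, `organize`, `select`, and `enrich_and_index` are hypothetical, the word count stands in for whatever meta characteristic data the patent uses, and simple lowercase tokenization stands in for enrichment processing.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Portion:
    """One organized data portion of extracted text (hypothetical structure)."""
    text: str
    meta: Dict[str, int]  # assumed meta characteristic data: position, word count
    enriched: Dict[str, List[str]] = field(default_factory=dict)


def organize(extracted_text: str) -> List[Portion]:
    """Split extracted text into portions on blank lines and record metadata."""
    portions: List[Portion] = []
    for block in extracted_text.split("\n\n"):
        block = block.strip()
        if block:
            meta = {"position": len(portions), "words": len(block.split())}
            portions.append(Portion(block, meta))
    return portions


def select(portions: List[Portion], min_words: int = 5) -> List[Portion]:
    """Assumed evaluation rule: keep only portions with enough words."""
    return [p for p in portions if p.meta["words"] >= min_words]


def enrich_and_index(portions: List[Portion]) -> Dict[str, List[int]]:
    """Stand-in enrichment: tokenize each portion, then build an inverted index
    mapping each token to the positions of the portions that contain it."""
    index: Dict[str, List[int]] = {}
    for p in portions:
        tokens = {w.strip(".,;:").lower() for w in p.text.split()}
        tokens.discard("")
        p.enriched["tokens"] = sorted(tokens)
        for token in tokens:
            index.setdefault(token, []).append(p.meta["position"])
    return index


if __name__ == "__main__":
    doc = (
        "Page 1\n\n"
        "Text extraction organizes data into portions for evaluation.\n\n"
        "Footer\n\n"
        "Selected portions are enriched and then indexed."
    )
    selected = select(organize(doc))
    index = enrich_and_index(selected)
    print(index["portions"])
```

In this sketch the short "Page 1" and "Footer" blocks fail the assumed word-count evaluation, so only the two substantive portions reach the enrichment and indexing step.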
Publication/grant documents
- US20200272648A1 Text Extraction and Processing, publication/grant date: 2020-08-27