-
Publication No.: US11720605B1
Publication Date: 2023-08-08
Application No.: US17876069
Filing Date: 2022-07-28
Applicant: Intuit Inc.
Inventors: Tharathorn Rimchala, Yingxin Wang
IPC Classification: G06F16/28, G06V30/14, G06F16/93, G06F16/2457
CPC Classification: G06F16/287, G06F16/24578, G06F16/93, G06V30/1444
Abstract: A visual-based classification model influenced by text features as a result of the outputs of a text-based classification model is disclosed. A system receives one or more documents to be classified based on one or more visual features and provides the one or more documents to a student classification model, which is a visual-based classification model. The system also classifies, by the student classification model, the one or more documents into one or more document types based on one or more visual features. The one or more visual features are generated by the student classification model that is trained based on important text identified by a teacher classification model for the one or more document types, with the teacher classification model being a text-based classification model. Generating training data and training the student classification model based on the training data are also described.
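The abstract does not say how the teacher's important text is turned into training data for the visual student. The sketch below is one possible reading, not the patented method: it assumes each word comes with an OCR bounding box and that the teacher exposes a per-word importance score. `Word`, `important_regions`, the score dictionary, and the threshold are all hypothetical names introduced for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


class Word(NamedTuple):
    """A word located on the page image (assumed to come from OCR)."""
    text: str
    box: Box


def important_regions(
    words: List[Word],
    teacher_scores: Dict[str, float],
    threshold: float,
) -> List[Box]:
    """Return the boxes of words the (hypothetical) teacher scores as important.

    The returned regions could serve as visual supervision for a student
    model; how the student consumes them is outside this sketch.
    """
    return [
        w.box
        for w in words
        if teacher_scores.get(w.text.lower(), 0.0) >= threshold
    ]
```

Under these assumptions, a word such as "Invoice" with a high teacher score would contribute its box to the student's training example, while a low-scoring word such as "the" would be left out.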
-
Publication No.: US11989500B2
Publication Date: 2024-05-21
Application No.: US18453772
Filing Date: 2023-08-22
Applicant: INTUIT INC.
Inventors: Zhewen Fan, Farzaneh Khoshnevisan, Byungkyu Kang, Yingxin Wang, Sonia Sharma
IPC Classification: G06F40/106, G06F16/34, G06F40/205
CPC Classification: G06F40/106, G06F16/345, G06F40/205
Abstract: Aspects of the present disclosure provide techniques for improved automated parsing and display of electronic documents. Embodiments include identifying a set of topics in a first electronic document based on one or more rules related to one or more keywords in the first electronic document. Embodiments include providing one or more inputs to a machine learning model based on the set of topics and a second electronic document related to the first electronic document. Embodiments include receiving, from the machine learning model in response to the one or more inputs, one or more outputs related to formatting the second electronic document for display. Embodiments include generating a formatted version of the first electronic document based on the set of topics and generating a formatted version of the second electronic document based on the one or more outputs.
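The first two steps of the abstract (keyword rules that yield topics, then model inputs built from those topics and a second document) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions: the rule table `TOPIC_RULES`, the substring matching, and the input record shape are hypothetical, and the machine learning model itself is not shown.

```python
from typing import Dict, List

# Hypothetical keyword rules: each topic is triggered by any of its keywords.
TOPIC_RULES: Dict[str, List[str]] = {
    "billing": ["invoice", "payment"],
    "shipping": ["delivery", "tracking"],
}


def identify_topics(document: str, rules: Dict[str, List[str]]) -> List[str]:
    """Return topics whose keywords occur in the document (case-insensitive)."""
    text = document.lower()
    return [
        topic
        for topic, keywords in rules.items()
        if any(keyword in text for keyword in keywords)
    ]


def build_model_inputs(topics: List[str], second_document: str) -> List[Dict[str, str]]:
    """Pair each topic from the first document with the second document's text."""
    return [{"topic": topic, "text": second_document} for topic in topics]
```

Simple substring matching is used here only to keep the example short; a real rule set would likely need word boundaries or more specific patterns.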
-
Publication No.: US12099539B2
Publication Date: 2024-09-24
Application No.: US17647607
Filing Date: 2022-01-11
Applicant: INTUIT INC.
Inventors: Krysten Nicole Dell, Jason Heckendorn, Lin Tao, Yingxin Wang
CPC Classification: G06F16/35, G06F16/345, G06N7/01, G06N20/00
Abstract: Aspects of the present disclosure provide techniques for improved text classification. Embodiments include providing, based on a text string, one or more first inputs to a summary model. Embodiments include determining, based on one or more first outputs from the summary model in response to the one or more first inputs, a summarized version of the text string. In some embodiments the summarized version of the text string comprises a number of tokens that is less than or equal to a maximum number of input tokens for a machine learning model. Embodiments include providing, based on the summarized version of the text string, one or more second inputs to the machine learning model. Embodiments include determining one or more attributes of the text string based on one or more second outputs received from the machine learning model in response to the one or more second inputs.
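The two-stage flow in this abstract (summarize so the text fits the model's input limit, then classify the summary) can be sketched as below. Everything here is a stand-in chosen for illustration: the whitespace tokenizer, the sentence-keeping `summarize` function, the keyword-based `classify` function, and the limit `MAX_INPUT_TOKENS` are assumptions, not the summary model or machine learning model described in the patent.

```python
from typing import Dict, List

MAX_INPUT_TOKENS = 8  # hypothetical input limit of the downstream model


def tokenize(text: str) -> List[str]:
    """Toy whitespace tokenizer (real models typically use subword tokens)."""
    return text.split()


def summarize(text: str, max_tokens: int) -> str:
    """Stand-in summary model: keep leading sentences while they fit the budget."""
    kept: List[str] = []
    for sentence in text.split(". "):
        candidate = kept + tokenize(sentence)
        if len(candidate) > max_tokens:
            break
        kept = candidate
    if not kept:
        # Even the first sentence exceeds the budget, so truncate it.
        kept = tokenize(text)[:max_tokens]
    return " ".join(kept)


def classify(text: str) -> Dict[str, str]:
    """Stand-in classifier: a keyword rule in place of a trained model."""
    topic = "refund" if "refund" in text.lower() else "other"
    return {"topic": topic}


def determine_attributes(text: str, max_tokens: int = MAX_INPUT_TOKENS) -> Dict[str, str]:
    """Summarize first so the classifier never sees more than max_tokens tokens."""
    summary = summarize(text, max_tokens)
    assert len(tokenize(summary)) <= max_tokens
    return classify(summary)
```

The point of the design is that the length guarantee is enforced before classification, so the downstream model's input limit is never exceeded regardless of how long the original text string is.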
-
Publication No.: US20240037125A1
Publication Date: 2024-02-01
Application No.: US18211127
Filing Date: 2023-06-16
Applicant: Intuit Inc.
Inventors: Tharathorn RIMCHALA, Yingxin Wang
IPC Classification: G06F16/28, G06V30/14, G06F16/2457, G06F16/93
CPC Classification: G06F16/287, G06V30/1444, G06F16/24578, G06F16/93
Abstract: A visual-based classification model influenced by text features as a result of the outputs of a text-based classification model is disclosed. A system receives one or more documents to be classified based on one or more visual features and provides the one or more documents to a student classification model, which is a visual-based classification model. The system also classifies, by the student classification model, the one or more documents into one or more document types based on one or more visual features. The one or more visual features are generated by the student classification model that is trained based on important text identified by a teacher classification model for the one or more document types, with the teacher classification model being a text-based classification model. Generating training data and training the student classification model based on the training data are also described.
-
Publication No.: US11783112B1
Publication Date: 2023-10-10
Application No.: US17937086
Filing Date: 2022-09-30
Applicant: INTUIT INC.
Inventors: Zhewen Fan, Farzaneh Khoshnevisan, Byungkyu Kang, Yingxin Wang, Sonia Sharma
IPC Classification: G06F40/106, G06F16/34, G06F40/205
CPC Classification: G06F40/106, G06F16/345, G06F40/205
Abstract: Aspects of the present disclosure provide techniques for improved automated parsing and display of electronic documents. Embodiments include identifying a set of topics in a first electronic document based on one or more rules related to one or more keywords in the first electronic document. Embodiments include providing one or more inputs to a machine learning model based on the set of topics and a second electronic document related to the first electronic document. Embodiments include receiving, from the machine learning model in response to the one or more inputs, one or more outputs related to formatting the second electronic document for display. Embodiments include generating a formatted version of the first electronic document based on the set of topics and generating a formatted version of the second electronic document based on the one or more outputs.
-