- Patent title: Cleaning noise words from transaction descriptions
- Application number: US15408241
- Filing date: 2017-01-17
- Publication number: US10546348B1
- Publication date: 2020-01-28
- Inventors: Christopher Lesner, Alexander Ran
- Applicants: Christopher Lesner, Alexander Ran
- Applicant address: Mountain View, CA, US
- Assignee: Intuit Inc.
- Current assignee: Intuit Inc.
- Current assignee address: Mountain View, CA, US
- Agent: Ferguson Braswell Fraser Kubasta PC
- Main classification: G06Q40/02
- IPC classifications: G06Q40/02; G06F17/27
Abstract:
A method, system, and non-transitory computer readable medium for removing noise ngrams from transaction records. The method may include obtaining noise ngrams; ordering the noise ngrams based on frequency of occurrence; discarding a portion of the noise ngrams below a frequency threshold to obtain a higher frequency subset of the noise ngrams; obtaining a transaction record of interest; and identifying a portion of the higher frequency subset within the transaction record of interest. Identifying the portion of the higher frequency subset may include constructing a regular expression based on the higher frequency subset; constructing a finite state machine based on the regular expression; providing the transaction record of interest as an input to the finite state machine; and executing the finite state machine. The method may also include removing, based on the identification, the portion of the higher frequency subset from the transaction record of interest.
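
The steps in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the sample ngrams, the `min_count` threshold, and the sample transaction record are all hypothetical. Python's `re` module is a backtracking matcher rather than an explicitly constructed finite state machine, so the compiled pattern only stands in for the "construct a finite state machine and execute it" step.

```python
import re
from collections import Counter


def top_noise_ngrams(noise_ngrams, min_count):
    """Order ngrams by frequency and keep those seen at least min_count times."""
    counts = Counter(noise_ngrams)
    # most_common() returns (ngram, count) pairs in descending frequency order.
    return [ngram for ngram, count in counts.most_common() if count >= min_count]


def build_noise_pattern(ngrams):
    """Build one alternation regex from the higher frequency subset."""
    if not ngrams:
        raise ValueError("at least one noise ngram is required")
    # Longer ngrams first, so a long ngram is tried before any shorter prefix of it.
    ordered = sorted(ngrams, key=len, reverse=True)
    alternation = "|".join(re.escape(ngram) for ngram in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def clean_record(record, pattern):
    """Remove matched noise ngrams and collapse the leftover whitespace."""
    stripped = pattern.sub(" ", record)
    return " ".join(stripped.split())


# Hypothetical noise ngrams observed across many transaction records.
observed = ["POS DEBIT", "POS DEBIT", "POS DEBIT",
            "PURCHASE", "PURCHASE", "CARD 1234", "XX"]

frequent = top_noise_ngrams(observed, min_count=2)
pattern = build_noise_pattern(frequent)
print(clean_record("POS DEBIT PURCHASE STARBUCKS #1234 SEATTLE WA", pattern))
# -> STARBUCKS #1234 SEATTLE WA
```

Compiling one alternation over the whole subset means each record is scanned once, instead of once per ngram. A library that compiles regular expressions to true automata could be substituted where a guaranteed linear-time scan matters.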