-
71.
Publication No.: US20230075113A1
Publication date: 2023-03-09
Application No.: US18055338
Filing date: 2022-11-14
IPC classification: G06F40/232, G06F40/58
Abstract: A system, method and computer-readable storage devices for providing unsupervised normalization of noisy text using distributed representation of words. The system receives, from a social media forum, a word having a non-canonical spelling in a first language. The system determines a context of the word in the social media forum, identifies the word in a vector space model, and selects “n-best” vector paths in the vector space model, where the n-best vector paths are neighbors to the vector space path based on the context and the non-canonical spelling. The system can then select, based on a similarity cost, a best path from the n-best vector paths and identify a word associated with the best path as the canonical version.
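
The idea in this abstract (look up the noisy word in a vector space, take its n nearest neighbors, then choose among them by a similarity cost) can be illustrated with a minimal sketch. This is not the patented method: the toy 2-D vectors, the use of cosine similarity for neighbor search, and the use of edit distance as the similarity cost are all assumptions made for illustration, and the names `VECTORS`, `cosine`, `edit_distance` and `normalize` are hypothetical.

```python
import math

# Toy 2-D "embeddings" (assumed values); real systems learn these from large corpora.
VECTORS = {
    "tomorrow": (0.90, 0.10),
    "2morrow": (0.88, 0.12),
    "today": (0.80, 0.30),
    "banana": (0.10, 0.90),
}

def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def edit_distance(a, b):
    """Levenshtein distance, used here as an assumed similarity cost."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def normalize(word, n=2):
    """Take the n nearest neighbors by cosine, then pick the lowest-cost one."""
    neighbors = sorted(
        (w for w in VECTORS if w != word),
        key=lambda w: cosine(VECTORS[word], VECTORS[w]),
        reverse=True,
    )[:n]
    return min(neighbors, key=lambda w: edit_distance(word, w))

print(normalize("2morrow"))
```

With these toy vectors the two nearest neighbors of "2morrow" are "tomorrow" and "today", and the edit-distance cost favors "tomorrow".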
-
72.
Publication No.: US20230071799A1
Publication date: 2023-03-09
Application No.: US17810775
Filing date: 2022-07-05
IPC classification: G06F40/166, G06F40/30, G06F40/279, G06F40/205, G06F40/232, G06F40/253, G06F40/247, G06N3/08, G06N3/04
Abstract: A system and method for extracting suggestions from review text is disclosed. The disclosed methods include utilizing natural language processing techniques and knowledge graphs to extract implicit suggestions from review text. In this way, conflicting descriptions can be eliminated and similar descriptions can be consolidated. In later operations, the pruned knowledge graphs may be converted into textual summaries to provide more concise suggestions from the raw review text.
-
73.
Publication No.: US11568152B2
Publication date: 2023-01-31
Application No.: US16930471
Filing date: 2020-07-16
Applicant: Citrix Systems, Inc.
Inventor(s): Lampros Dounis
IPC classification: G06F40/35, G06N20/00, G06F40/253, G06F40/247, G06F40/232, G06F40/289, G06N5/04, G06F40/279, G06F40/30, H04L67/01, G06F40/205
Abstract: A computer system configured for autonomous learning of entity values is provided. The computer system includes a memory that stores associations between entities and fields of response data. The computer system also includes a processor configured to receive a request to process an intent; generate a request to fulfill the intent; transmit the request to a fulfillment service; receive, from the fulfillment service, response data specifying values of the fields; identify the values of the fields within the response data; identify the entities via the associations using the fields; store, within the memory, the values of the fields as values of the entities; and retrain a natural language processor using the values of the entities.
-
74.
Publication No.: US20220415070A1
Publication date: 2022-12-29
Application No.: US17903804
Filing date: 2022-09-06
Applicant: c/o Wacom Co., Ltd.
IPC classification: G06V30/18, G06F3/04883, G06V30/32, G06F40/232
Abstract: An ink data modification or correction method, and an information processing device and a program for implementing the method are provided, which allow automatic correction of ink data including a spelling error in a handwritten character string. An ink data modification method according to the present disclosure includes determining a modification method of ink data by detecting a spelling error included in a handwritten character string represented by the ink data, and modifying the ink data by manipulating the ink data on the basis of the determined modification method. For example, the determined modification method may be to add a missing character, or to delete a superfluous character, or to correct a typo by replacing an erroneous character with a correct character.
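
The three modification methods named in this abstract (add a missing character, delete a superfluous one, replace an erroneous one) can be sketched on plain strings. This is only an illustration under stated assumptions: it works on already-recognized text rather than on ink stroke data, it handles a single-character error only, and the function name `classify_correction` is hypothetical.

```python
def classify_correction(written, intended):
    """Return (operation, index, char) for a single-character error, else None."""
    if written == intended:
        return None
    # The length of the shared prefix marks where the two strings first differ.
    i = 0
    while i < min(len(written), len(intended)) and written[i] == intended[i]:
        i += 1
    # A character is missing from the written string: add it at index i.
    if len(written) + 1 == len(intended) and written[i:] == intended[i + 1:]:
        return ("add", i, intended[i])
    # The written string has one extra character: delete it at index i.
    if len(written) == len(intended) + 1 and written[i + 1:] == intended[i:]:
        return ("delete", i, written[i])
    # Same length, one wrong character: replace it at index i.
    if len(written) == len(intended) and written[i + 1:] == intended[i + 1:]:
        return ("replace", i, intended[i])
    return None

print(classify_correction("helo", "hello"))    # ('add', 3, 'l')
print(classify_correction("helllo", "hello"))  # ('delete', 4, 'l')
print(classify_correction("hallo", "hello"))   # ('replace', 1, 'e')
```

An ink-based implementation would additionally have to map the chosen operation back onto strokes, which this sketch does not attempt.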
-
75.
Publication No.: US11526657B2
Publication date: 2022-12-13
Application No.: US17375225
Filing date: 2021-07-14
Inventor(s): Chenhui Li, Teng Hu, Yongfeng Chen
IPC classification: G06F40/166, G06F40/194, G06F40/279, G06F40/103, G06F40/232
Abstract: This application discloses a method, an apparatus and an electronic device for error correction of numerical contents in a text, and relates to the technical field of artificial intelligence, such as natural language processing and deep learning. The implementation method is: obtaining a target text to be processed; determining original numerical contents included in the target text; determining target types corresponding to the original numerical contents; and performing error correction on each original numerical content according to an error correction manner corresponding to each target type. Error correction of numerical contents is therefore realized according to the types of the numerical contents; it is not limited to correcting the numerical format but also covers logical error correction of the numerical content, so as to improve the numerical error correction capability and thereby improve the recall rate of detection and correction of wrong values.
-
76.
Publication No.: US11481547B2
Publication date: 2022-10-25
Application No.: US17142718
Filing date: 2021-01-06
Applicant: TENCENT AMERICA LLC
Inventor(s): Tao Yang, Zeyu You, Min Tu, Shangqing Zhang, Xu Wang, Lianyi Han, Wei Fan
IPC classification: G06F40/53, G06F40/129, G06F40/232, G06F40/274
Abstract: A method, computer program, and computer system are provided for text error identification and correction. A text input having a phonetic component and a glyphic component is received. Information corresponding to the phonetic component and the glyphic component is coded as a fixed-length sequence. One or more candidate replacement words corresponding to the fixed-length sequence are identified. At least a portion of the text input is replaced with a candidate replacement word from among the one or more candidate replacement words.
-
77.
Publication No.: US20220309247A1
Publication date: 2022-09-29
Application No.: US17347773
Filing date: 2021-06-15
Inventor(s): Jithu R. Jacob, Siddhartha Das
IPC classification: G06F40/30, G06N20/00, G06N7/00, G06F40/117, G06F40/232
Abstract: The present invention provides for improving a training dataset by identifying errors in the training dataset and generating improvement recommendations. In operation, the present invention provides for identifying and correcting duplicate utterances in a training dataset comprising utterances-intent pairs. Further, a plurality of Natural Language ML models are trained with the corrected training dataset to obtain a diverse set of trained ML models. Each utterance of the training dataset is fed as input to the trained ML models, and a probability of error associated with each utterances-intent pair of the training dataset is evaluated based on analysis of the respective intent predictions received from each of the trained ML models. Furthermore, spelling errors in the dataset are identified and data imbalances in the training dataset are evaluated. Finally, a set of improvement recommendations for each utterances-intent pair is generated based on the evaluated probability of errors, spelling errors, duplicate utterances and data imbalances.
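
Three of the checks described in this abstract (duplicate utterances, disagreement among several trained models, and data imbalance) can be sketched as follows. This is a simplified illustration, not the patented procedure: the model predictions are passed in as plain lists rather than produced by trained models, the error probability is assumed to be the fraction of models that disagree with the label, and the function names are hypothetical.

```python
from collections import Counter

def find_duplicates(pairs):
    """Utterances that occur more than once, ignoring case and extra whitespace."""
    counts = Counter(" ".join(u.lower().split()) for u, _ in pairs)
    return sorted(u for u, c in counts.items() if c > 1)

def error_probability(labelled_intent, predictions):
    """Assumed measure: fraction of model predictions that disagree with the label."""
    if not predictions:
        return 0.0
    return sum(p != labelled_intent for p in predictions) / len(predictions)

def intent_imbalance(pairs):
    """Ratio of the most frequent intent count to the least frequent one."""
    counts = Counter(intent for _, intent in pairs)
    return max(counts.values()) / min(counts.values())

pairs = [
    ("Book a flight", "book"),
    ("book a  flight", "book"),
    ("Cancel it", "cancel"),
]
print(find_duplicates(pairs))                                       # ['book a flight']
print(error_probability("book", ["book", "cancel", "cancel", "book"]))  # 0.5
print(intent_imbalance(pairs))                                      # 2.0
```

A pair with a high disagreement fraction is a candidate for relabeling; the threshold for flagging it would be a design choice that the abstract does not specify.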
-
78.
Publication No.: US11436409B2
Publication date: 2022-09-06
Application No.: US16458427
Filing date: 2019-07-01
Inventor(s): Zhipeng Wu, Zhihua Wang, Tianxing Yang
IPC classification: G06F40/253, G06F40/295, G06F40/232, G06F40/30
Abstract: Embodiments disclose a method and apparatus for updating subject name information of a target information source. A specific embodiment of the method includes: acquiring at least one subject name from to-be-processed information; matching, for a subject name in the at least one subject name, the subject name with pre-acquired at least one initial information source subject name, and setting a weight for the subject name based on a matching result; and sorting the subject name in the at least one subject name and an initial information source subject name in the at least one initial information source subject name according to the weight to obtain updated at least one initial information source subject name.
-
79.
Publication No.: US11379661B2
Publication date: 2022-07-05
Application No.: US16821055
Filing date: 2020-03-17
Applicant: Xerox Corporation
Inventor(s): Keith L. Willis
IPC classification: G06F40/232, G06F40/253, G06F40/166
Abstract: Disclosed are methods of displaying and editing a document having editable text that allows for a wide variety of word verification features beyond merely correcting errors in spelling or grammar and allows for such detailed editing in a simple and user-friendly manner.
-
80.
Publication No.: US20220164530A1
Publication date: 2022-05-26
Application No.: US17327072
Filing date: 2021-05-21
Inventor(s): Hyukchul Kwon, Jung-Hun Lee
IPC classification: G06F40/232, G06F40/242, G06F40/279, G06F40/30, G06F40/166
Abstract: Disclosed is a system for generating test documents for context-sensitive spelling error correction. The system includes: an input unit inputting an error-free document for generating an error document; an error target word segment test unit checking possibility of an error in a word segment by sequentially examining word segments of the entire sentences in the document input through the input unit and searching for a candidate word appearing at the corresponding position together with surrounding context; an error word candidate selection unit selecting error word candidates among candidate words found at the corresponding position by considering edit distances to a correct word and keyboard typographical errors; and an error word determination and presentation unit calculating probabilities of an error word candidate and its surrounding context and determining an error word of the highest priority as a final error word.
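
The candidate-selection step in this abstract (error words that are a small edit distance from the correct word and plausible as keyboard typos) can be sketched roughly as below. This is a simplified illustration rather than the disclosed system: it considers only single-key substitutions, omits the context-probability ranking entirely, and uses a partial, hypothetical QWERTY adjacency map; `KEY_NEIGHBORS`, `typo_candidates` and `inject_error` are names invented for the example.

```python
import random

# Partial QWERTY adjacency map (hypothetical, for illustration only).
KEY_NEIGHBORS = {
    "a": "qwsz", "e": "wsdr", "i": "ujko", "o": "iklp",
    "r": "edft", "s": "awedxz", "t": "rfgy", "n": "bhjm",
}

def typo_candidates(word):
    """All strings one keyboard-adjacent substitution away from `word`."""
    out = set()
    for i, ch in enumerate(word):
        for nb in KEY_NEIGHBORS.get(ch, ""):
            out.add(word[:i] + nb + word[i + 1:])
    return out

def inject_error(word, vocabulary, rng):
    """Prefer a typo that is itself a real word, so only context can reveal it."""
    real_word_errors = sorted(typo_candidates(word) & vocabulary)
    return rng.choice(real_word_errors) if real_word_errors else word

vocab = {"car", "cat", "dog"}
print(inject_error("cat", vocab, random.Random(0)))  # car
print(inject_error("dog", vocab, random.Random(0)))  # dog (no real-word typo found)
```

Restricting candidates to in-vocabulary words is what makes the generated errors context-sensitive: a dictionary lookup alone cannot flag "car" in place of "cat".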
-