Invention Grant
US07827133B2 Method and arrangement for SIM algorithm automatic charset detection
Active
- Patent Title: Method and arrangement for SIM algorithm automatic charset detection
- Application No.: US12714392
- Application Date: 2010-02-26
- Publication No.: US07827133B2
- Publication Date: 2010-11-02
- Inventor: Lili Diao
- Applicant: Lili Diao
- Applicant Address: US CA Cupertino
- Assignee: Trend Micro Inc.
- Current Assignee: Trend Micro Inc.
- Current Assignee Address: US CA Cupertino
- Agency: IP Strategy Group, P.C.
- Main IPC: G06F15/00
- IPC: G06F15/00 ; G06F15/18

Abstract:
The invention relates, in an embodiment, to a computer-implemented method for handling a target document that has been transmitted electronically and involves an encoding scheme. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes using SIM (Similarity Algorithm) to generate the set of machine learning models from feature vectors obtained from the plurality of text document samples. The method also includes applying the set of machine learning models against a set of target document feature vectors converted from the target document to detect the encoding scheme. The method further includes decoding the target document, based on at least the detected encoding scheme, to obtain the decoded content of the document.
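The train, detect, and decode flow described in the abstract can be illustrated with a minimal sketch. This is not the patented SIM algorithm, whose details this record does not give. As stated assumptions, the sketch uses byte-bigram frequencies as feature vectors, one averaged centroid vector per charset as the "model", and cosine similarity as the similarity measure. All function names (`byte_bigram_vector`, `train`, `detect`, `decode`) are hypothetical.

```python
import math
from collections import Counter


def byte_bigram_vector(data: bytes) -> dict:
    """Convert raw bytes into a normalized byte-bigram frequency vector.

    Assumption: bigram frequencies stand in for the patent's feature vectors.
    """
    counts = Counter(zip(data, data[1:]))
    total = sum(counts.values()) or 1
    return {gram: n / total for gram, n in counts.items()}


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(samples: dict) -> dict:
    """Build one centroid vector per charset.

    samples maps a charset name to a non-empty list of sample texts;
    each text is encoded with that charset before feature extraction.
    """
    models = {}
    for charset, texts in samples.items():
        centroid = Counter()
        for text in texts:
            for gram, weight in byte_bigram_vector(text.encode(charset)).items():
                centroid[gram] += weight
        models[charset] = {g: w / len(texts) for g, w in centroid.items()}
    return models


def detect(models: dict, data: bytes) -> str:
    """Return the charset whose centroid is most similar to the target bytes."""
    vec = byte_bigram_vector(data)
    return max(models, key=lambda charset: cosine(vec, models[charset]))


def decode(models: dict, data: bytes) -> tuple:
    """Detect the charset, then decode the target document with it."""
    charset = detect(models, data)
    return charset, data.decode(charset, errors="replace")
```

A call such as `decode(train(samples), raw_bytes)` returns the guessed charset name together with the decoded text. A realistic detector would need far larger training sets and more careful features than this toy setup.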
Public/Granted literature
- US20100153320A1 METHOD AND ARRANGEMENT FOR SIM ALGORITHM AUTOMATIC CHARSET DETECTION, Public/Granted day: 2010-06-17