Invention Application
- Patent Title: MULTI-LINGUAL WORD HYPHENATION USING INDUCTIVE MACHINE LEARNING ON TRAINING DATA
- Application No.: US12015489
- Application Date: 2008-01-16
- Publication No.: US20090182550A1
- Publication Date: 2009-07-16
- Inventor: Siarhei Alonichau, Ravi Shahani, Kevin Powell
- Applicant: Siarhei Alonichau, Ravi Shahani, Kevin Powell
- Applicant Address: US WA Redmond
- Assignee: MICROSOFT CORPORATION
- Current Assignee: MICROSOFT CORPORATION
- Current Assignee Address: US WA Redmond
- Main IPC: G06F17/28
- IPC: G06F17/28

Abstract:
Tools and techniques are described for providing multi-lingual word hyphenation using inductive machine learning on training data. Methods provided by these techniques may receive training data that includes hyphenated words, and may inductively generate hyphenation patterns that represent substrings of these words. The hyphenation patterns may include the substrings and hyphenation codes associated with characters occurring in the substrings. The methods may receive induction parameters applicable to generating the hyphenation patterns, and may store the hyphenation patterns into a language-specific lexicon file. These methods may also receive requests to hyphenate input words that occur in a human language, and may evaluate how to process the request based on the language. The methods may search for hyphenation patterns occurring in the input words, with the hyphenation patterns being stored in the lexicon file. Finally, the methods may respond to the request, indicating whether the hyphenation patterns occurred in the input words.
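The lookup half of the abstract (pick a lexicon by language, search the input word for stored patterns, report whether any occurred) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the patent: the names `TOY_LEXICONS`, `find_breaks`, and `hyphenate`, the in-memory dictionary standing in for a lexicon file, the toy patterns themselves, and the Liang-style convention that the highest code at each gap wins and an odd code permits a break while an even code forbids one.

```python
from typing import Dict, List, Tuple

# Hypothetical per-language lexicons (stand-in for lexicon files).
# Each pattern maps a substring to len(substring) + 1 hyphenation codes:
# codes[j] applies to the gap just before substring[j], and the last code
# applies to the gap after the final character.
# Assumed convention: odd code = break allowed, even code = break forbidden.
TOY_LEXICONS: Dict[str, Dict[str, List[int]]] = {
    "en": {
        "yph": [0, 1, 0, 0],      # allow y-ph
        "ena": [0, 0, 1, 0],      # allow en-a
        "phe": [0, 1, 0, 0],      # would allow p-he ...
        "yphe": [0, 0, 2, 0, 0],  # ... but this longer pattern forbids it
    }
}


def find_breaks(word: str, lexicon: Dict[str, List[int]]) -> Tuple[List[int], List[str]]:
    """Return (break positions, matched patterns) for one word."""
    scores = [0] * (len(word) + 1)  # one score per gap, including both ends
    matched: List[str] = []
    for start in range(len(word)):
        for end in range(start + 1, len(word) + 1):
            codes = lexicon.get(word[start:end])
            if codes is None:
                continue
            matched.append(word[start:end])
            for j, code in enumerate(codes):
                # The highest code seen at a gap overrides lower ones.
                scores[start + j] = max(scores[start + j], code)
    # Only interior gaps can hold a hyphen.
    breaks = [g for g in range(1, len(word)) if scores[g] % 2 == 1]
    return breaks, matched


def hyphenate(word: str, language: str) -> Tuple[str, bool]:
    """Return (hyphenated word, whether any stored pattern occurred)."""
    lexicon = TOY_LEXICONS.get(language)
    if lexicon is None:
        return word, False  # no lexicon for this language: leave word as-is
    breaks, matched = find_breaks(word.lower(), lexicon)
    pieces, prev = [], 0
    for g in breaks:
        pieces.append(word[prev:g])
        prev = g
    pieces.append(word[prev:])
    return "-".join(pieces), bool(matched)


print(hyphenate("hyphenation", "en"))
```

In this toy lexicon the `phe` pattern alone would permit a break between "p" and "h", but the longer `yphe` pattern assigns that gap a higher even code and suppresses it. That is one plausible way for induced patterns to encode exceptions, and the patent's actual code semantics may differ.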
Public/Granted literature
- US08996994B2 Multi-lingual word hyphenation using inductive machine learning on training data (Public/Granted day: 2015-03-31)