摘要:
A language input architecture receives input text (e.g., phonetic text of a character-based language) entered by a user from an input device (e.g., keyboard, voice recognition). The input text is converted to an output text (e.g., written language text of a character-based language). The language input architecture has a user interface that displays the output text and unconverted input text in line with one another. As the input text is converted, it is replaced in the UI with the converted output text. In addition to this in-line input feature, the UI enables in-place editing or error correction without requiring the user to switch modes from an entry mode to an edit mode. To assist with this in-place editing, the UI presents pop-up windows containing the phonetic text from which the output text was converted as well as first and second candidate lists that contain small and large sets of alternative candidates that might be used to replace the current output text. The language input user interface also allows a user to enter a mixed text of different languages.
摘要:
A language input architecture receives input text (e.g., phonetic text of a character-based language) entered by a user from an input device (e.g., keyboard, voice recognition). The input text is converted to an output text (e.g., written language text of a character-based language). The language input architecture has a user interface that displays the output text and unconverted input text in line with one another. As the input text is converted, it is replaced in the UI with the converted output text. In addition to this in-line input feature, the UI enables in-place editing or error correction without requiring the user to switch modes from an entry mode to an edit mode. To assist with this in-place editing, the UI presents pop-up windows containing the phonetic text from which the output text was converted as well as first and second candidate lists that contain small and large sets of alternative candidates that might be used to replace the current output text. The language input user interface also allows a user to enter a mixed text of different languages.
摘要:
A language input architecture converts input strings of phonetic text (e.g., Chinese Pinyin) to an output string of language text (e.g., Chinese Hanzi) in a manner that minimizes typographical errors and conversion errors that occur during conversion from the phonetic text to the language text. The language input architecture has a search engine, one or more typing models, a language model, and one or more lexicons for different languages. Each typing model is trained on real data, and learns probabilities of typing errors. The typing model is configured to generate a list of probable typing candidates that may be substituted for the input string based on probabilities of how likely each of the candidate strings was incorrectly entered as the input string.
摘要:
A language input architecture converts input strings of phonetic text to an output string of language text. The language input architecture has a search engine, one or more typing models, a language model, and one or more lexicons for different languages. The typing model is configured to generate a list of probable typing candidates that may be substituted for the input string based on probabilities of how likely each of the candidate strings was incorrectly entered as the input string. The language model provides probable conversion strings for each of the typing candidates based on probabilities of how likely a probable conversion output string represents the candidate string. The search engine combines the probabilities of the typing and language models to find the most probable conversion string that represents a converted form of the input string.
摘要:
A language input architecture converts input strings of phonetic text to an output string of language text. The language input architecture has a search engine, typing models, a language model, and one or more lexicons for different languages. Each typing model is trained on real data, and learns probabilities of typing errors. The typing model is configured to generate a list of probable typing candidates that may be substituted for the input string based on probabilities of how likely each of the candidate strings was incorrectly entered as the input string. The language model provides probable conversion strings for each of the typing candidates based on probabilities of how likely a probable conversion output string represents the candidate string. The search engine combines the probabilities of the typing and language models to find the most probable conversion string that represents a converted form of the input string.
摘要:
A language input architecture converts input strings of phonetic text to an output string of language text. The language input architecture has a search engine, one or more typing models, a language model, and one or more lexicons for different languages. The typing model is configured to generate a list of probable typing candidates that may be substituted for the input string based on probabilities of how likely each of the candidate strings was incorrectly entered as the input string. The language model provides probable conversion strings for each of the typing candidates based on probabilities of how likely a probable conversion output string represents the candidate string. The search engine combines the probabilities of the typing and language models to find the most probable conversion string that represents a converted form of the input string.
摘要:
A language input architecture converts input strings of phonetic text (e.g., Chinese Pinyin) to an output string of language text (e.g., Chinese Hanzi) in a manner that minimizes typographical errors and conversion errors that occur during conversion from the phonetic text to the language text. The language input architecture has a search engine, one or more typing models, a language model, and one or more lexicons for different languages. Each typing model is trained on real data, and learns probabilities of typing errors. The typing model is configured to generate a list of probable typing candidates that may be substituted for the input string based on probabilities of how likely each of the candidate strings was incorrectly entered as the input string. The probable typing candidates may be stored in a database. The language model provides probable conversion strings for each of the typing candidates based on probabilities of how likely a probable conversion output string represents the candidate string. The search engine combines the probabilities of the typing and language models to find the most probable conversion string that represents a converted form of the input string. By generating typing candidates and then using the associated conversion strings to replace the input string, the architecture eliminates many common typographical errors. When multiple typing models are employed, the architecture can automatically distinguish among multiple languages without requiring mode switching for entry of the different languages.
摘要:
A language input architecture converts input strings of phonetic text to an output string of language text. The language input architecture has a search engine, one or more typing models, a language model, and one or more lexicons for different languages. The typing model is configured to generate a list of probable typing candidates that may be substituted for the input string based on probabilities of how likely each of the candidate strings was incorrectly entered as the input string. The language model provides probable conversion strings for each of the typing candidates based on probabilities of how likely a probable conversion output string represents the candidate string. The search engine combines the probabilities of the typing and language models to find the most probable conversion string that represents a converted form of the input string.
摘要:
A method is presented comprising assigning each of a plurality of segments comprising a received corpus to a node in a data structure denoting dependencies between nodes, and calculating a transitional probability between each of the nodes in the data structure.
摘要:
The generation and management of a language model data structure include assigning each segment of a received corpus to a node in a data structure that denotes dependencies between the respective nodes. A transitional probability between each of the nodes in the data structure is calculated. A frequency of occurrence is calculated for each item of the respective segments, and those nodes of the data structure associated with items that do not meet a minimum frequency of occurrence threshold are removed. The data structure may be managed across a system memory of a computer system and an extended memory of the computer system.