Abstract:
A method, computer system, and computer program product for translating information are provided. The computer system receives the information for a translation. In response to receiving the information, the computer system identifies portions of the information based on a set of security rules for the information. The computer system sends the portions of the information to a plurality of translation systems. In response to receiving translation results from the plurality of translation systems for respective portions of the information, the computer system combines the translation results for the respective portions to form a consolidated translation of the information.
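The flow described in this abstract (split, distribute, recombine) can be sketched roughly as follows. This is only an illustration under stated assumptions: the one-sentence-per-portion rule, the round-robin dispatch, and the names `split_into_portions`, `translate_securely`, `system_a`, and `system_b` are hypothetical and do not come from the abstract, which does not specify the security rules or the translation systems.

```python
import re
from typing import Callable, List

Translator = Callable[[str], str]


def split_into_portions(text: str, max_sentences: int = 1) -> List[str]:
    """Assumed security rule: no portion holds more than max_sentences sentences."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]


def translate_securely(text: str, translators: List[Translator],
                       max_sentences: int = 1) -> str:
    """Send portions to the translators in turn, then rejoin results in order."""
    portions = split_into_portions(text, max_sentences)
    results = []
    for index, portion in enumerate(portions):
        # Round-robin dispatch (an assumption): no single system sees every portion
        # when there are at least two translators and two portions.
        translator = translators[index % len(translators)]
        results.append(translator(portion))
    return " ".join(results)


# Stand-in "translation systems" that only tag their input.
def system_a(portion: str) -> str:
    return f"[A:{portion}]"


def system_b(portion: str) -> str:
    return f"[B:{portion}]"


consolidated = translate_securely(
    "Alpha one. Beta two. Gamma three.", [system_a, system_b]
)
print(consolidated)  # [A:Alpha one.] [B:Beta two.] [A:Gamma three.]
```

A real system would replace the stand-in functions with calls to separate translation services and would likely use richer rules than a sentence count.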
Abstract:
Methods and systems for fast translation memory search include, in response to an input query string, identifying a plurality of hypothesis strings stored in a translation memory as candidates to match the query string. One or more candidates are eliminated, using a processor, where differences in string length between the candidates and the query string are at least a cutoff value representing a string edit distance. One or more candidates are eliminated where differences in word frequency distributions between the candidates and the query string are at least the cutoff value. One or more candidates are eliminated by employing a dynamic programming matrix where string edit distances between the candidates and the query string are at least the cutoff value. A number of remaining candidates are output as matches to the query string.
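The three elimination stages can be sketched as a cascade of progressively more expensive checks. The sketch below assumes a word-level edit distance and uses one possible reading of "difference in word frequency distributions" (the larger of the two multiset surpluses, which never exceeds the word-level edit distance). The function names and the sample memory are hypothetical.

```python
from collections import Counter
from typing import List


def length_gap(a: List[str], b: List[str]) -> int:
    """Cheapest check: the difference in word count."""
    return abs(len(a) - len(b))


def bag_gap(a: List[str], b: List[str]) -> int:
    """Word-frequency check: the larger multiset surplus between a and b."""
    count_a, count_b = Counter(a), Counter(b)
    surplus_a = sum((count_a - count_b).values())
    surplus_b = sum((count_b - count_a).values())
    return max(surplus_a, surplus_b)


def edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance, keeping two rows of the DP matrix."""
    previous = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        current = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def search(query: str, memory: List[str], cutoff: int) -> List[str]:
    """Keep candidates whose edit distance to the query is below the cutoff."""
    query_words = query.split()
    matches = []
    for candidate in memory:
        words = candidate.split()
        if length_gap(query_words, words) >= cutoff:
            continue
        if bag_gap(query_words, words) >= cutoff:
            continue
        if edit_distance(query_words, words) >= cutoff:
            continue
        matches.append(candidate)
    return matches


memory = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog barked",
    "mat the on sat cat the",
]
matches = search("the cat sat on the mat", memory, cutoff=2)
print(matches)
```

Because both cheap checks are lower bounds on the edit distance, they never discard a candidate that the dynamic programming step would have kept. In the sample, the third entry is rejected on length alone, and the fourth passes both cheap checks but fails the full edit distance.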
Abstract:
A method, system, and computer readable storage medium including a computer readable program are provided. The method includes storing a set of sentences in a memory device. The method further includes receiving an input translated phrase and searching the set of sentences for a subset of sentences closest to the input translated phrase based on a set of respective distances to the sentences in the set with respect to the input translated phrase. The method also includes calculating and outputting a language model score for the subset of sentences based on a function of a subset of respective distances pertaining to the subset of sentences.
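As a rough illustration of scoring a translated phrase by its distance to stored sentences, the sketch below uses `difflib.SequenceMatcher` over word lists as the distance and the mean similarity of the k nearest sentences as the score. Both choices, and the names `distance`, `nearest`, and `language_model_score`, are assumptions made for this example. The abstract does not say which distance or which scoring function is used.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def distance(a: str, b: str) -> float:
    """Assumed distance: 1 minus the word-level similarity ratio, in [0, 1]."""
    return 1.0 - SequenceMatcher(None, a.split(), b.split()).ratio()


def nearest(phrase: str, sentences: List[str], k: int) -> List[Tuple[float, str]]:
    """Return the k stored sentences closest to the phrase, with their distances."""
    return sorted((distance(phrase, s), s) for s in sentences)[:k]


def language_model_score(phrase: str, sentences: List[str], k: int = 2) -> float:
    """Assumed scoring function: mean similarity over the k nearest sentences."""
    neighbours = nearest(phrase, sentences, k)
    return sum(1.0 - d for d, _ in neighbours) / len(neighbours)


stored = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a bird flew over the house",
]
fluent = language_model_score("the cat sat on the rug", stored)
garbled = language_model_score("rug cat the on sat purple", stored)
print(fluent > garbled)
```

A phrase that resembles the stored sentences gets a higher score than one with the same words in a scrambled order, which is the behavior expected of a fluency score.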
Abstract:
A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
Abstract:
A system and method for representing call content in a searchable database includes transcribing call content to text. The call content is projected to vector space, by creating a vector by indexing the call based on the content and determining a similarity of the call to an atomic-class dictionary. The call is classified in a relational database in accordance with the vector.
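One way to picture the projection step is shown below. It is a sketch under assumptions: the two-class dictionary, the word-overlap similarity, the table layout, and the names `project` and `store_call` are all hypothetical, since the abstract does not define the atomic-class dictionary or the database schema.

```python
import sqlite3
from collections import Counter
from typing import Dict

# Hypothetical atomic-class dictionary: class name -> characteristic words.
ATOMIC_CLASSES = {
    "billing": {"bill", "charge", "payment", "refund"},
    "technical": {"error", "crash", "install", "password"},
}


def project(transcript: str) -> Dict[str, float]:
    """Map a transcript to one similarity value per atomic class."""
    counts = Counter(transcript.lower().split())
    total = sum(counts.values()) or 1
    return {
        name: sum(counts[word] for word in vocabulary) / total
        for name, vocabulary in ATOMIC_CLASSES.items()
    }


def store_call(conn: sqlite3.Connection, call_id: str, transcript: str) -> str:
    """Classify the call by its largest vector component and store the row."""
    vector = project(transcript)
    label = max(vector, key=vector.get)
    conn.execute(
        "INSERT INTO calls VALUES (?, ?, ?, ?)",
        (call_id, label, vector["billing"], vector["technical"]),
    )
    return label


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calls (call_id TEXT PRIMARY KEY, label TEXT, "
    "billing REAL, technical REAL)"
)
label = store_call(conn, "call-001", "I want a refund for the extra charge on my bill")
rows = conn.execute("SELECT call_id FROM calls WHERE label = 'billing'").fetchall()
print(label, rows)
```

Once the vector components are stored as columns, calls can be searched with ordinary SQL, for example by label or by a threshold on one component. The sketch splits on whitespace only, so real transcripts would need punctuation handling.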
Abstract:
Techniques for detecting data anomalies in a natural language understanding (NLU) system are provided. A number of categorized sentences, categorized into a number of categories, are obtained. Sentences within a given one of the categories are clustered into a number of sub-clusters, and the sub-clusters are analyzed to identify data anomalies. The clustering can be based on surface forms of the sentences. The anomalies can be, for example, ambiguities or inconsistencies. The clustering can be performed, for example, with a K-means clustering algorithm.
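A minimal sketch of the idea, assuming bag-of-words vectors as the "surface form" and treating undersized sub-clusters as candidate anomalies. The abstract names K-means as one option, but the vectorization, the deterministic initialization from the first k sentences, the `min_size` heuristic, and all function names here are assumptions for illustration.

```python
from collections import Counter
from typing import List


def vectorize(sentences: List[str]) -> List[List[float]]:
    """Bag-of-words vectors over the vocabulary of the given sentences."""
    vocabulary = sorted({w for s in sentences for w in s.lower().split()})
    return [
        [float(Counter(s.lower().split())[w]) for w in vocabulary]
        for s in sentences
    ]


def squared_distance(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[List[float]], k: int, iterations: int = 10) -> List[int]:
    """Plain K-means; centroids start at the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def find_anomalies(sentences: List[str], k: int = 2, min_size: int = 2) -> List[str]:
    """Return sentences that fall in sub-clusters smaller than min_size."""
    labels = kmeans(vectorize(sentences), k)
    sizes = Counter(labels)
    return [s for s, lab in zip(sentences, labels) if sizes[lab] < min_size]


# Sentences all labeled with one (hypothetical) category, "check_balance".
check_balance = [
    "what is my account balance",
    "cancel my credit card",
    "what is my balance",
    "show my account balance",
    "what is the balance",
]
suspicious = find_anomalies(check_balance)
print(suspicious)  # ['cancel my credit card']
```

The flagged sentence would then be reviewed by a person, since a small sub-cluster may be a mislabeled example or simply a rare but valid phrasing.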