Abstract:
Computer-implemented techniques include receiving a phrase in a first language and obtaining a corpus comprising a plurality of phrases in the first language and word reordering information for the plurality of phrases, the word reordering information indicating a correct word order for each phrase in a second language. Word-to-word correspondences between each of the phrases in the first language and the corresponding correct word order for the phrase in the second language are identified and at least one tree that allows for the identified word-to-word correspondences is generated. Based upon the at least one tree, a statistical model for reordering from a word order that is correct for the first language to a word order that is correct for the second language is created. Based upon the statistical model, a reordered phrase from the received phrase is generated, the reordered phrase having a correct word order for the second language.
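
The pipeline described in this abstract can be illustrated with a minimal sketch. It is not the claimed implementation, and everything in it is an assumption made for illustration: the function names, the toy corpus, representing the word reordering information as a permutation (`perm[i]` is the second-language position of first-language word `i`), the choice of a binary tree with "straight" and "inverted" nodes, the boundary-word-pair counts used as the statistical model, and the fixed right-branching tree used when reordering a new phrase.

```python
from collections import defaultdict


def build_tree(perm):
    """Build a binary tree that permits the given word-to-word correspondences.

    Leaves are source word indices; internal nodes are
    ("straight" | "inverted", left, right). Returns None when no binary
    tree of this kind can produce the permutation.
    """
    stack = []  # each item: (lowest target pos, highest target pos, subtree)
    for src_idx, tgt_pos in enumerate(perm):
        stack.append((tgt_pos, tgt_pos, src_idx))
        while len(stack) >= 2:
            (lo1, hi1, t1), (lo2, hi2, t2) = stack[-2], stack[-1]
            if hi1 + 1 == lo2:      # left span stays before right span
                merged = (lo1, hi2, ("straight", t1, t2))
            elif hi2 + 1 == lo1:    # right span moves before left span
                merged = (lo2, hi1, ("inverted", t1, t2))
            else:
                break
            stack[-2:] = [merged]
    return stack[0][2] if len(stack) == 1 else None


def collect_counts(tree, words, counts):
    """Count node labels keyed by the two source words at each split point."""
    if isinstance(tree, int):
        return tree, tree
    label, left, right = tree
    l_first, l_last = collect_counts(left, words, counts)
    r_first, r_last = collect_counts(right, words, counts)
    counts[(words[l_last], words[r_first])][label] += 1
    return l_first, r_last


def train(corpus):
    """corpus: iterable of (source words, permutation) pairs."""
    counts = defaultdict(lambda: {"straight": 0, "inverted": 0})
    for words, perm in corpus:
        tree = build_tree(perm)
        if tree is not None:
            collect_counts(tree, words, counts)
    return counts


def reorder(words, counts):
    """Reorder a new phrase with a right-branching tree (a simplification)."""
    if len(words) <= 1:
        return list(words)
    head, rest = words[0], reorder(words[1:], counts)
    c = counts.get((head, words[1]), {"straight": 1, "inverted": 0})
    return rest + [head] if c["inverted"] > c["straight"] else [head] + rest


# Toy corpus: verb-object phrases whose target order is object-verb.
corpus = [
    ("eat the apple".split(), [2, 0, 1]),
    ("read the book".split(), [2, 0, 1]),
]
model = train(corpus)
print(" ".join(reorder("eat the pear".split(), model)))  # the pear eat
```

A practical system would condition on richer features than a single word pair (for example part-of-speech tags) and would search over candidate trees rather than assume a right-branching one; the sketch only shows how the tree, the statistics, and the reordering step fit together.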
Abstract:
A computer-implemented technique can include receiving, at a server including one or more processors, a source word in a source language. The technique can include determining, at the server, one or more potential translations for the source word in a target language different from the source language. The technique can include determining, at the server, one or more synonyms for each of the one or more potential translations to obtain a plurality of potential translations. The technique can include determining, at the server, one or more translation clusters using the plurality of potential translations and a clustering algorithm. Each translation cluster can contain all of the plurality of potential translations that have a similar denotation, and each of the plurality of potential translations that have a similar denotation can be included in a specific translation cluster. The technique can also include outputting, at the server, the one or more translation clusters.
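
The steps in this abstract (translate, expand with synonyms, cluster, output) can be sketched as follows. This is a hedged illustration, not the claimed method: the abstract does not name a clustering algorithm, so connected components over synonym links (via union-find) stand in for it, and the `TRANSLATIONS` and `SYNONYMS` tables are small hypothetical lookups standing in for whatever dictionary and thesaurus resources a server would use.

```python
# Hypothetical toy resources (English -> Spanish); not from the source.
TRANSLATIONS = {"bank": ["banco", "orilla", "ribera"]}
SYNONYMS = {
    "banco": ["entidad bancaria"],
    "orilla": ["ribera", "margen"],
    "ribera": ["orilla"],
}


def expand(source_word):
    """Potential translations plus their synonyms, without duplicates."""
    candidates = []
    for translation in TRANSLATIONS.get(source_word, []):
        for word in [translation, *SYNONYMS.get(translation, [])]:
            if word not in candidates:
                candidates.append(word)
    return candidates


def cluster(candidates):
    """Group candidates linked by a synonym relation (assumed stand-in
    for 'similar denotation'); returns sorted clusters of sorted words."""
    parent = {word: word for word in candidates}

    def find(word):
        while parent[word] != word:
            parent[word] = parent[parent[word]]  # path compression
            word = parent[word]
        return word

    for word in candidates:
        for synonym in SYNONYMS.get(word, []):
            if synonym in parent:
                parent[find(synonym)] = find(word)

    groups = {}
    for word in candidates:
        groups.setdefault(find(word), []).append(word)
    return sorted(sorted(group) for group in groups.values())


clusters = cluster(expand("bank"))
for group in clusters:
    print(group)
# ['banco', 'entidad bancaria']
# ['margen', 'orilla', 'ribera']
```

Because every candidate ends up in exactly one component, each cluster holds all candidates that the synonym links connect, which mirrors the abstract's requirement that translations with a similar denotation share a single cluster.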