摘要:
A method of identifying and using side information available to statistical machine translation systems within an enterprise setting, the method including extracting user-specific interaction and non-interaction-based information from at least one corresponding database within the enterprise for each of a plurality of users, aggregating the user-specific interaction and non-interaction based information from a plurality of users, by using a processor on a computer, to tune and adapt background translation and language models, and updating all relevant models within the enterprise after user activity based on the tuned and adapted translation and language models.
摘要:
A system and method for training a statistical machine translation model and decoding or translating using the same is disclosed. A source word versus target word co-occurrence matrix is created to define word pairs. Dimensionality of the matrix may be reduced. Word pairs are mapped as vectors into continuous space where the word pairs are vectors of continuous real numbers and not discrete entities in the continuous space. A machine translation parametric model is trained using an acoustic model training method based on word pair vectors in the continuous space.
摘要:
A system and method for speech recognition includes generating a set of likely hypotheses in recognizing speech, rescoring the likely hypotheses by using semantic content by employing semantic structured language models, and scoring parse trees to identify a best sentence according to the sentence's parse tree by employing the semantic structured language models to clarify the recognized speech.
摘要:
A language processing application employs a classing function optimized for the underlying production application context for which it is expected to process speech. A combination of class based and word based features generates a classing function optimized for a particular production application, meaning that a language model employing the classing function uses word classes having a high likelihood of accurately predicting word sequences encountered by a language model invoked by the production application. The classing function optimizes word classes by aligning the objective of word classing with the underlying language processing task to be performed by the production application. The classing function is optimized to correspond to usage in the production application context using class-based and word-based features by computing a likelihood of a word in an n-gram and a frequency of a word within a class of the n-gram.
摘要:
Techniques for monitoring a set of one or more event counters of application execution are provided. The techniques include constructing a virtual performance monitoring counter (VPMC) layer as a unified abstraction of a physical performance monitoring counter (PMC) architecture, and incorporating one or more programming interfaces (PIs) in connection with the virtual performance monitoring counter, wherein the one or more programming interfaces facilitate simultaneous access and data monitoring across a set of one or more event counters.
摘要:
A method of identifying and using side information available to statistical machine translation systems within an enterprise setting, the method including extracting user-specific interaction and non-interaction-based information from at least one corresponding database within the enterprise for each of a plurality of users, aggregating the user-specific interaction and non-interaction based information from a plurality of users, by using a processor on a computer, to tune and adapt background translation and language models, and updating all relevant models within the enterprise after user activity based on the tuned and adapted translation and language models.
摘要:
A method, apparatus and computer instructions is provided for fast semi-automatic semantic annotation. Given a limited annotated corpus, the present invention assigns a tag and a label to each word of the next limited annotated corpus using a parser engine, a similarity engine, and a SVM engine. A rover then combines the parse trees from the three engines and annotates the next chunk of limited annotated corpus with confidence, such that the efforts required for human annotation is reduced.
摘要:
Operation of an automated dialog system is described using a source language to conduct a real time human machine dialog process with a human user using a target language. A user query in the target language is received and automatically machine translated into the source language. An automated reply of the dialog process is then delivered to the user in the target language. If the dialog process reaches an initial assistance state, a first human agent using the source language is provided to interact in real time with the user in the target language by machine translation to continue the dialog process. Then if the dialog process reaches a further assistance state, a second human agent using the target language is provided to interact in real time with the user in the target language to continue the dialog process.
摘要:
Processes capable of accepting linguistic input in one or more languages are generated by re-using existing linguistic components associated with a different anchor language, together with machine translation components that translate between the anchor language and the one or more languages. Linguistic input is directed to machine translation components that translate such input from its language into the anchor language. Those existing linguistic components are then utilized to initiate responsive processing and generate output. Optionally, the output is directed through the machine translation components. A language identifier can initially receive linguistic input and identify the language within which such linguistic input is provided to select an appropriate machine translation component. A hybrid process, comprising machine translation components and linguistic components associated with the anchor language, can also serve as an initiating construct from which a single language process is created over time.
摘要:
Processes capable of accepting linguistic input in one or more languages are generated by re-using existing linguistic components associated with a different anchor language, together with machine translation components that translate between the anchor language and the one or more languages. Linguistic input is directed to machine translation components that translate such input from its language into the anchor language. Those existing linguistic components are then utilized to initiate responsive processing and generate output. Optionally, the output is directed through the machine translation components. A language identifier can initially receive linguistic input and identify the language within which such linguistic input is provided to select an appropriate machine translation component. A hybrid process, comprising machine translation components and linguistic components associated with the anchor language, can also serve as an initiating construct from which a single language process is created over time.