Abstract:
Techniques are provided for determining and using interaction models. Discourse functions, prosodic features and turn information are determined from the speech information in a training corpus. Statistics, decision trees, rules and/or various other methods are used to determine a predictive interaction model based on the discourse functions, the prosodic features and the turn information. Predictive interaction models are optionally determined for individual users, genres, languages and/or other characteristics of the speech information. The predictive interaction model is useable to predict turns in a dialogue based on the discourse functions and prosodic features identified in the speech information. Speech information is presented and/or received based on the predictive interaction model.
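As a rough illustration of the statistics-based variant described above, the following sketch counts how often a turn change follows each pairing of discourse function and prosodic feature in a training corpus, then uses those counts to estimate the likelihood of a turn. The function names, the label strings, and the toy corpus are assumptions made for this example only; the abstract does not specify a data format or an estimator.

```python
from collections import Counter, defaultdict


def train_interaction_model(corpus):
    """Count turn outcomes per (discourse function, prosodic feature) pair.

    corpus: iterable of (discourse_function, prosodic_feature, turn_taken)
    triples. This layout is a hypothetical one chosen for the sketch.
    """
    counts = defaultdict(Counter)
    for function, prosody, turn_taken in corpus:
        counts[(function, prosody)][turn_taken] += 1
    return counts


def predict_turn(model, function, prosody):
    """Estimate the probability of a turn change, or None if the pair is unseen."""
    seen = model.get((function, prosody))
    if not seen:
        return None
    return seen[True] / sum(seen.values())


# Toy training corpus with made-up labels.
corpus = [
    ("question", "rising_pitch", True),
    ("question", "rising_pitch", True),
    ("question", "rising_pitch", False),
    ("elaboration", "flat_pitch", False),
]
model = train_interaction_model(corpus)
likelihood = predict_turn(model, "question", "rising_pitch")
```

A separate model of this kind could be trained per user, genre, or language by partitioning the corpus before counting.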
Abstract:
Techniques are provided for determining predictive models of discourse functions based on prosodic features of natural language speech. Inter- and intra-sentential discourse functions in a training corpus of natural language speech utterances are determined. The discourse functions are clustered. The exemplary prosodic features associated with each type of discourse function are determined. Machine learning, observation and the like are used to determine, for each type of discourse function, a subset of the associated prosodic features that is useful in predicting the likelihood of that type of discourse function.
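One simple way to picture the feature-subset step is to score each prosodic feature by how far its mean for one discourse function lies from its mean for all other functions, and keep the highest-scoring features. This is only a sketch under that assumption; the abstract leaves the selection method open ("machine learning, observation and the like"), and the function name, feature names, and sample values below are hypothetical.

```python
from statistics import mean


def select_features(samples, function, top_k=1):
    """Pick the prosodic features that best separate one discourse function.

    samples: list of (discourse_function, {feature_name: value}) pairs,
    a hypothetical layout with features normalized to the range 0..1.
    """
    inside = [feats for label, feats in samples if label == function]
    outside = [feats for label, feats in samples if label != function]
    # Separation score: absolute gap between in-class and out-of-class means.
    gap = {
        name: abs(mean(f[name] for f in inside) - mean(f[name] for f in outside))
        for name in inside[0]
    }
    return sorted(gap, key=gap.get, reverse=True)[:top_k]


# Toy samples with made-up feature values.
samples = [
    ("question", {"final_pitch": 0.9, "intensity": 0.5}),
    ("question", {"final_pitch": 0.8, "intensity": 0.6}),
    ("statement", {"final_pitch": 0.2, "intensity": 0.5}),
    ("statement", {"final_pitch": 0.3, "intensity": 0.6}),
]
selected = select_features(samples, "question")
```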
Abstract:
Techniques are provided for resolving ambiguity in natural language speech. Speech is recognized using automatic speech recognition. A theory of discourse analysis is determined and at least one set of candidate discourse functions is determined based on the theory of discourse analysis. Prosodic features in the speech and a correlation between the prosodic features and the discourse functions are determined. The sets of candidate discourse functions are ranked based on the prosodic features in the speech information and a correlation to the prosodic features expected for the determined discourse functions. Ambiguity is resolved between sets of candidate discourse functions based on the rank information.
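The ranking step can be sketched as scoring each candidate set of discourse functions by how closely the observed prosody of each speech segment matches the prosody expected for the function assigned to it. The similarity measure (one minus the mean absolute difference of normalized features), the function names, and the sample profiles are assumptions for illustration and are not taken from the abstract.

```python
def score(candidate, observed, expected):
    """Mean per-segment similarity between observed and expected prosody.

    candidate: tuple of discourse-function labels, one per speech segment.
    observed:  list of {feature_name: value} dicts, one per segment (0..1).
    expected:  {discourse_function: {feature_name: value}} profiles.
    """
    total = 0.0
    for function, features in zip(candidate, observed):
        profile = expected[function]
        diffs = [abs(features[name] - profile[name]) for name in profile]
        total += 1.0 - sum(diffs) / len(diffs)
    return total / len(candidate)


def rank_candidates(candidates, observed, expected):
    """Order candidate sets from best to worst prosodic match."""
    return sorted(
        candidates, key=lambda c: score(c, observed, expected), reverse=True
    )


# Hypothetical expected profiles and one observed segment.
expected = {
    "question": {"final_pitch": 0.9, "pause": 0.6},
    "statement": {"final_pitch": 0.2, "pause": 0.4},
}
observed = [{"final_pitch": 0.8, "pause": 0.5}]
candidates = [("statement",), ("question",)]
ranked = rank_candidates(candidates, observed, expected)
```

The top-ranked candidate set is then taken as the resolved reading of the ambiguous utterance.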
Abstract:
Techniques are provided for synthesizing speech using discourse function level prosodic features. An output text is determined. The discourse functions within the text are determined based on a theory of discourse analysis such as the Unified Linguistic Discourse Model. The salient prosodic features associated with the discourse functions are identified using a predictive model of discourse functions or some other model of salient prosodic features. The discourse functions are transformed into synthesized speech. Discourse function level prosodic feature adjustments are determined and applied to the synthesized speech, which is then output.
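The adjustment step might look like the sketch below, which attaches pitch and rate scaling factors to each text segment according to its discourse function before the segment is handed to a synthesizer. The adjustment table, the parameter names `pitch_scale` and `rate_scale`, and the segment layout are all hypothetical; no real text-to-speech API is called here.

```python
# Neutral settings used when a discourse function has no entry in the table.
DEFAULT_ADJUSTMENT = {"pitch_scale": 1.0, "rate_scale": 1.0}


def apply_prosody(segments, adjustments):
    """Annotate each (text, discourse_function) segment with prosodic settings."""
    return [
        {
            "text": text,
            "function": function,
            **adjustments.get(function, DEFAULT_ADJUSTMENT),
        }
        for text, function in segments
    ]


# Hypothetical adjustment table keyed by discourse function.
adjustments = {
    "question": {"pitch_scale": 1.2, "rate_scale": 1.0},
    "subordination": {"pitch_scale": 0.9, "rate_scale": 1.1},
}
segments = [
    ("Are you coming", "question"),
    ("because the meeting starts soon", "subordination"),
    ("Thanks", "closing"),
]
annotated = apply_prosody(segments, adjustments)
```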