Abstract:
Individuals may interact with automated services as one or more parties, where such individuals may have collective (as well as individual) intents. Moreover, parties may concurrently communicate with an interface to such a service, and the interface may have to manage several concurrent interactions with different parties. Single-individual interfaces may be unable to react robustly to such dynamic and complex real-world scenarios. Instead, multi-party interfaces to service components may be devised that identify individuals within a scene, associate the individuals with parties, track a set of interactions of the parties with the service component, and direct the service component in interacting with the parties. A multi-party interface may also detect and politely handle interruptions, identify information items about individuals and parties based on context and history, prioritize the intents of the individuals and parties, and triage interactions accordingly.
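The bookkeeping described above (associating individuals with parties, then triaging parties by the priority of their intents) might be sketched roughly as follows. This is only an illustrative sketch, not the described interface itself: the names `Party`, `MultiPartyTracker`, `associate`, and `triage`, and the integer `priority` field, are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    """A group of individuals sharing a collective intent (hypothetical model)."""
    party_id: str
    members: set = field(default_factory=set)
    priority: int = 0  # assumption: a higher value means a more urgent intent


class MultiPartyTracker:
    """Tracks which individuals belong to which party and orders parties for service."""

    def __init__(self):
        self.parties = {}   # party_id -> Party
        self.party_of = {}  # individual -> party_id

    def associate(self, individual, party_id, priority=0):
        # Create the party on first sight, then add the individual to it.
        party = self.parties.setdefault(party_id, Party(party_id))
        party.members.add(individual)
        # Assumption: a party is as urgent as its most urgent member's intent.
        party.priority = max(party.priority, priority)
        self.party_of[individual] = party_id
        return party

    def triage(self):
        # Most urgent party first; ties keep the order in which parties appeared.
        return sorted(self.parties.values(), key=lambda p: -p.priority)


tracker = MultiPartyTracker()
tracker.associate("alice", "p1", priority=1)
tracker.associate("bob", "p1", priority=1)
tracker.associate("carol", "p2", priority=3)
order = [p.party_id for p in tracker.triage()]
```

Here `order` lists the party containing "carol" first, because its intent was given the higher (assumed) priority. A real interface would derive parties and priorities from scene analysis, context, and history rather than from explicit calls.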
Abstract:
Described is a technology by which a structured model of repetition is used to determine the words spoken by a user, and/or a corresponding database entry, based in part on a prior utterance. For a repeated utterance, a joint probability analysis is performed on (at least some of) the corresponding word sequences (as recognized by one or more recognizers) and the associated acoustic data. For example, a generative probabilistic model or a maximum entropy model may be used in the analysis. The repeated (second) utterance may repeat the prior (first) utterance using the exact words, or may be another structural transformation of the first utterance, such as an extension that adds one or more words, a truncation that removes one or more words, or a whole or partial spelling of one or more words.
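As a rough illustration of the joint analysis, the sketch below handles only the simplest case, an exact repetition, and treats the two utterances as independent given the intended words. It is not the generative or maximum entropy model mentioned above. The function name `joint_rescore`, the `floor` probability for unseen hypotheses, and the example scores are all assumptions made for this example.

```python
import math


def joint_rescore(nbest_first, nbest_second, floor=1e-6):
    """Pick the hypothesis with the highest joint log score over two utterances.

    Each n-best list maps a recognized word sequence to the recognizer's
    probability for it. A hypothesis missing from one list gets the small
    `floor` probability (an assumption) instead of being ruled out.
    """
    scores = {}
    for hyp in set(nbest_first) | set(nbest_second):
        p1 = nbest_first.get(hyp, floor)
        p2 = nbest_second.get(hyp, floor)
        # Exact-repetition assumption: both utterances carry the same words,
        # so the joint score is the sum of the two log probabilities.
        scores[hyp] = math.log(p1) + math.log(p2)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical recognizer outputs for a first attempt and its repetition.
first = {"austin": 0.5, "boston": 0.4, "houston": 0.1}
second = {"boston": 0.45, "austin": 0.2, "dallas": 0.35}
best, scores = joint_rescore(first, second)
```

With these made-up scores, `best` is "boston": it is only the second choice for the first utterance, but it is the most probable hypothesis once both utterances are considered together. Extensions, truncations, and spellings would need an additional term that scores how the second word sequence is derived from the first.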