Abstract:
A language processing system may determine a display form of a spoken word by analyzing the spoken form using a language model that includes dictionary entries for display forms of homonyms. The homonyms may include trade names as well as given names and other phrases. The language processing system may receive spoken language and produce a display form of the language while displaying the proper form of the homonym. Such a system may be used in search systems where audio input is converted to a graphical display of a portion of the spoken input.
Abstract:
A multimedia system configured to receive user input in the form of a spelled character sequence is provided. In one implementation, a spell mode is initiated, and a user spells a character sequence. The multimedia system performs spelling recognition and recognizes a sequence of character representations that may be ambiguous because of user and/or system errors. The possibly ambiguous sequence of character representations yields multiple search keys. The multimedia system performs a fuzzy pattern search by scoring each target item from a finite dataset of target items based on the multiple search keys. One or more relevant items are ranked and presented to the user for selection, each relevant item being a target item that exceeds a relevancy threshold. The user selects the intended character sequence from the one or more relevant items.
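The multi-key fuzzy search described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the confusion model or the scoring function, so the `CONFUSABLE` table, the use of `difflib.SequenceMatcher`, and the threshold value below are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import product

# Hypothetical confusion sets: letters a spelling recognizer might mix up.
CONFUSABLE = {"b": "bdp", "d": "bdp", "p": "bdp", "m": "mn", "n": "mn"}


def expand_keys(recognized, limit=64):
    """Turn one ambiguous recognized sequence into several search keys."""
    options = [CONFUSABLE.get(ch, ch) for ch in recognized.lower()]
    keys = []
    for combo in product(*options):
        keys.append("".join(combo))
        if len(keys) >= limit:  # cap the expansion for long inputs
            break
    return keys


def fuzzy_search(recognized, targets, threshold=0.8):
    """Score each target against every key; keep and rank those at or above the threshold."""
    keys = expand_keys(recognized)
    scored = []
    for item in targets:
        # A target's score is its best similarity to any of the search keys.
        score = max(SequenceMatcher(None, key, item.lower()).ratio() for key in keys)
        if score >= threshold:
            scored.append((score, item))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in scored]
```

For example, `fuzzy_search("madonma", ["Madonna", "Metallica", "Nirvana"])` expands the misrecognized "m" into both "m" and "n", so one of the generated keys matches "Madonna" exactly.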
Abstract:
This document describes word-dependent language models, as well as their creation and use. A word-dependent language model can permit a speech-recognition engine to accurately verify that a speech utterance matches a multi-word phrase. This is useful in many contexts, including those where one or more letters of the expected phrase are known to the speaker.
Abstract:
An answering machine detection module is used to determine whether a call recipient is an actual person or an answering machine. The answering machine detection module includes a speech recognizer and a call analysis module. The speech recognizer receives an audible response of the call recipient to a call. The speech recognizer processes the audible response and provides an output indicative of recognized speech. The call analysis module processes the output of the speech recognizer to generate an output indicative of whether the call recipient is a person or an answering machine.
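As a rough illustration of the call analysis step, the sketch below classifies the recognizer's text output with two simple heuristics. The abstract does not say which features the call analysis module uses; the cue phrases, the word-count limit, and the function name `classify_recipient` are hypothetical.

```python
# Hypothetical phrases that tend to appear in recorded greetings.
MACHINE_CUES = (
    "leave a message",
    "after the beep",
    "after the tone",
    "not available",
)


def classify_recipient(recognized_text, max_live_words=6):
    """Return "machine" or "person" from the speech recognizer's text output."""
    text = recognized_text.lower()
    # Heuristic 1: recorded greetings often contain stock phrases.
    if any(cue in text for cue in MACHINE_CUES):
        return "machine"
    # Heuristic 2: a live person usually answers with a short greeting.
    if len(text.split()) > max_live_words:
        return "machine"
    return "person"
```

A production system would more plausibly combine such text features with timing and acoustic evidence in a trained classifier; the fixed rules here only show how recognizer output can feed a separate decision step.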
Abstract:
The presentation of location information to a user that is distracted by traveling can result in the user quickly forgetting, or never even comprehending, key parts of the location information, such as the street number. Identification can be made of intersections and points of interest near the user's destination, which can then be provided instead of, or in addition to, the address, thereby increasing user comprehension and retention, especially when distracted. Map data can be parsed into addresses, intersections and points of interest databases. These databases can be accessed to identify proximate intersections and points of interest, which can then be filtered and subsequently ranked to identify one intersection, one point of interest, or both, that can be presented to the user to aid the user in comprehending and retaining the location information even when distracted.
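The filter-then-rank step for proximate intersections and points of interest might look like the following sketch. It assumes candidates are stored as a name-to-coordinate mapping and that straight-line (haversine) distance is the only ranking signal; the abstract does not state the actual filtering or ranking criteria, and the names and coordinates in the usage example are made up.

```python
import math

EARTH_RADIUS_M = 6_371_000


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def nearest_landmark(destination, candidates, max_distance_m=300):
    """Filter candidates to those within range, then return the closest name (or None)."""
    nearby = [
        (haversine_m(destination, position), name)
        for name, position in candidates.items()
    ]
    nearby = [pair for pair in nearby if pair[0] <= max_distance_m]
    return min(nearby)[1] if nearby else None


# Usage with made-up data: one nearby intersection and one distant one.
destination = (47.6205, -122.3493)
intersections = {
    "Oak St & 2nd Ave": (47.6190, -122.3480),
    "Main St & 1st Ave": (47.6300, -122.3493),
}
landmark = nearest_landmark(destination, intersections)
```

The same function can be run once over an intersections table and once over a points-of-interest table, which yields the one intersection, one point of interest, or both that the abstract mentions.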
Abstract:
A statistical language model is trained for use in a directory assistance system using the data in a directory assistance listing corpus. Calculations are made to determine how important words in the corpus are in distinguishing a listing from other listings, and how likely words are to be omitted or added by a user. The language model is trained using these calculations.
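One plausible way to quantify how well a word distinguishes a listing from other listings is an inverse-document-frequency score, sketched below. This is an assumption: the abstract does not name the calculation, and the omission and addition likelihoods it mentions are not modeled here. The listings in the usage example are invented.

```python
import math
from collections import Counter


def word_importance(listings):
    """Score each word by log(N / number of listings containing it)."""
    doc_freq = Counter()
    for listing in listings:
        # Count each word once per listing, regardless of repetitions.
        doc_freq.update(set(listing.lower().split()))
    total = len(listings)
    return {word: math.log(total / count) for word, count in doc_freq.items()}


# Usage with invented listings.
listings = [
    "Corner Pizza Restaurant",
    "Lakeside Sushi Restaurant",
    "Corner Thai Restaurant",
]
importance = word_importance(listings)
```

Here "restaurant" appears in every listing and scores zero, while "sushi" appears in only one and scores highest, so a language model weighted by these values would treat "sushi" as far more informative than "restaurant".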
Abstract:
A method of forming a shareable filler model (shareable model for garbage words) from a word n-gram model is provided. The word n-gram model is converted into a probabilistic context free grammar (PCFG). The PCFG is modified into a substantially application-independent PCFG, which constitutes the shareable filler model.
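The first step, converting a word n-gram model into a PCFG, can be sketched for the bigram case: each history word becomes a nonterminal, and each bigram probability becomes a rule probability. This is a standard textbook construction offered as an assumption about what the conversion could look like; the nonterminal naming (`N_<word>`) is hypothetical, and the later modification into an application-independent PCFG is not shown because the abstract gives no details.

```python
def bigram_to_pcfg(bigram_probs, end_token="</s>"):
    """Convert {(prev, word): P(word | prev)} into a list of PCFG rules.

    Each rule is (lhs, rhs, probability). A nonterminal N_prev expands to
    "word N_word" with probability P(word | prev), or to the empty
    right-hand side when the next token is the end-of-sentence marker.
    """
    rules = []
    for (prev, word), prob in bigram_probs.items():
        lhs = f"N_{prev}"
        if word == end_token:
            rules.append((lhs, (), prob))
        else:
            rules.append((lhs, (word, f"N_{word}"), prob))
    return rules


# Usage with a toy bigram model.
toy_bigrams = {
    ("<s>", "call"): 1.0,
    ("call", "home"): 0.6,
    ("call", "</s>"): 0.4,
    ("home", "</s>"): 1.0,
}
toy_rules = bigram_to_pcfg(toy_bigrams)
```

Because every nonterminal's rules inherit a conditional distribution from the bigram model, the rule probabilities for each left-hand side still sum to one.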
Abstract:
An automated “Voice Search Message Service” provides a voice-based user interface for generating text messages from an arbitrary speech input. Specifically, the Voice Search Message Service provides a voice-search information retrieval process that evaluates user speech inputs to select one or more probabilistic matches from a database of pre-defined or user-defined text messages. These probabilistic matches are also optionally sorted in terms of relevancy. A single text message from the probabilistic matches is then selected and automatically transmitted to one or more intended recipients. Optionally, one or more of the probabilistic matches are presented to the user for confirmation or selection prior to transmission. Correction of, or recovery from, speech recognition errors is avoided because the probabilistic matches are intended to paraphrase the user speech input rather than exactly reproduce that speech, though exact matches are possible. Consequently, potential distractions to the user are significantly reduced relative to conventional speech recognition techniques.
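The retrieval step, selecting and sorting probabilistic matches from a message database, can be approximated with a bag-of-words cosine similarity, as in the sketch below. The similarity measure, the `top_k` parameter, and the sample messages are assumptions for illustration; the abstract does not describe the actual matching model.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def match_messages(recognized, messages, top_k=3):
    """Return up to top_k stored messages, most similar to the recognized speech first."""
    query = Counter(recognized.lower().split())
    scored = [(cosine(query, Counter(m.lower().split())), m) for m in messages]
    scored = [pair for pair in scored if pair[0] > 0]  # drop messages with no overlap
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [message for _, message in scored[:top_k]]


# Usage with invented messages and an imperfect recognition result.
stored = ["I am running late", "Call me when you are free", "I will be home soon"]
matches = match_messages("i'm running really late", stored)
```

The recognized text does not reproduce any stored message exactly, yet it still retrieves "I am running late", which is the paraphrase-style behavior the abstract relies on to avoid error correction.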
Abstract:
Training data may be provided, the training data including pairs of source phrases and target phrases. The pairs may be used to train an intra-language statistical machine translation model, where the intra-language statistical machine translation model, when given an input phrase of text in the human language, can compute probabilities of semantic equivalence of the input phrase to possible translations of the input phrase in the human language. The statistical machine translation model may be used to translate between queries and listings. The queries may be text strings in the human language submitted to a search engine. The listings may be text strings of formal names of real world entities that are to be searched by the search engine to find matches for the queries.
Abstract:
The claimed subject matter according to one aspect provides systems and/or methods that effectuate user development, customization, or utilization of dynamically configurable dialogue flow systems. The system can include devices and components that employ data associated with a user to retrieve navigation panes unique to the user, scan the navigation panes to identify adjustable attributes, and use the adjustable attributes to generate voice prompts communicated to the user via handheld devices. The user utters personalized responses in reply to the voice prompts, and, based at least on the personalized responses, the system initiates actions associated with the adjustable attributes.