Abstract:
Systems and processes for detecting an event within natural language are provided. In one example of a process, unstructured natural language information may be received from at least one user. The presence of event information in the unstructured natural language information may be determined. In accordance with a determination that event information is present within the unstructured natural language information, a pseudo-event entry associated with that event information may be generated.
Abstract:
Systems and processes for discourse input processing are provided. In one example process, a discourse input can be received from a user. An integrated probability of a candidate word in the discourse input and one or more subclasses associated with the candidate word can be determined based on a conditional probability of the candidate word given one or more words in the discourse input, a probability of the candidate word within a corpus, and a conditional probability of the candidate word given one or more classes associated with the one or more words. A text string corresponding to the discourse input can be determined based on the integrated probability. An output based on the text string can be generated.
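The abstract names three inputs to the integrated probability but does not state how they are combined. The sketch below is only one plausible combination, assumed for illustration: the probability given the preceding words is multiplied by the ratio of the class-conditional probability to the corpus probability, and the result is renormalized over the candidates. The function name `integrated_scores` and its parameter names are hypothetical.

```python
def integrated_scores(p_given_words, p_in_corpus, p_given_classes):
    """Combine three per-candidate probabilities into one normalized score.

    Assumed formula (not taken from the abstract):
        score(w) is proportional to P(w | words) * P(w | classes) / P(w)
    Each argument maps a candidate word to a probability.
    """
    raw = {}
    for word, p_words in p_given_words.items():
        p_corpus = p_in_corpus[word]
        if p_corpus <= 0:
            # A candidate that never occurs in the corpus cannot be rescaled.
            continue
        raw[word] = p_words * p_given_classes[word] / p_corpus

    total = sum(raw.values())
    if total == 0:
        return {}
    # Renormalize so the scores over the candidate set sum to 1.
    return {word: score / total for word, score in raw.items()}


if __name__ == "__main__":
    scores = integrated_scores(
        p_given_words={"meet": 0.5, "meat": 0.5},
        p_in_corpus={"meet": 0.2, "meat": 0.2},
        p_given_classes={"meet": 0.4, "meat": 0.1},
    )
    best = max(scores, key=scores.get)
    print(best, scores)
```

In this toy input the two candidates are equally likely given the preceding words and equally frequent in the corpus, so the class-conditional term alone decides the ranking.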
Abstract:
Systems and processes are disclosed for predicting words in a text entry environment. Candidate words and probabilities associated therewith can be determined by combining a word n-gram language model and a character m-gram language model. Based on entered text, candidate word probabilities from the word n-gram language model can be integrated with the corresponding candidate character probabilities from the character m-gram language model. A reduction in entropy can be determined from integrated candidate word probabilities before entry of the most recent character to integrated candidate word probabilities after entry of the most recent character. If the reduction in entropy exceeds a predetermined threshold, candidate words with high integrated probabilities can be displayed or otherwise made available to the user for selection. Otherwise, displaying candidate words can be deferred (e.g., pending receipt of an additional character from the user leading to reduced entropy in the candidate set).
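The entropy test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the two models are combined by multiplying their per-candidate probabilities and renormalizing, entropy is Shannon entropy in bits, and the names `integrate`, `entropy`, `should_display`, and `threshold_bits` are all hypothetical.

```python
import math


def integrate(word_probs, char_probs):
    """Combine word-model and character-model probabilities per candidate.

    Assumption: simple product followed by renormalization. Candidates that
    the character model rules out (probability 0) are dropped.
    """
    raw = {w: p * char_probs.get(w, 0.0) for w, p in word_probs.items()}
    total = sum(raw.values())
    if total == 0:
        return {}
    return {w: s / total for w, s in raw.items() if s > 0}


def entropy(probs):
    """Shannon entropy, in bits, of a normalized candidate distribution."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def should_display(before, after, threshold_bits):
    """Return True when the latest character reduced entropy by more than the threshold."""
    return entropy(before) - entropy(after) > threshold_bits


if __name__ == "__main__":
    # Four equally likely candidates before the latest keystroke (2 bits).
    before = {"the": 0.25, "then": 0.25, "dog": 0.25, "cat": 0.25}
    # Two equally likely candidates after it (1 bit).
    after = {"the": 0.5, "then": 0.5}
    print(should_display(before, after, threshold_bits=0.5))
```

Deferring the display when the reduction is small avoids showing a candidate list that the next keystroke is likely to reshuffle.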
Abstract:
Methods, systems, and computer-readable media are provided for combining two or more aspects of predictive information for auto-completion of user input, in particular user commands directed to an intelligent digital assistant. Specifically, predictive information based on (1) usage frequency, (2) usage recency, and (3) semantic information encapsulated in an ontology (e.g., a network of domains) implemented by the digital assistant is integrated within a unified framework, such that a consistent ranking of all completion candidates across all domains may be achieved. Auto-completions are selected and presented based on the unified ranking of all completion candidates.
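One way to picture a unified ranking over the three signals is a single weighted score per completion candidate. The sketch below is an assumption for illustration only: the abstract does not specify the scoring function, so the linear weighting, the half-life recency decay, the `semantic_score` field standing in for ontology-derived relevance, and every name in the code are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    use_count: int                 # usage frequency signal
    seconds_since_last_use: float  # usage recency signal
    semantic_score: float          # assumed ontology relevance in [0, 1]


def unified_score(c, max_count, half_life_s=86400.0, weights=(0.4, 0.3, 0.3)):
    """Blend frequency, recency, and semantic relevance into one score."""
    frequency = c.use_count / max_count if max_count else 0.0
    # Recency halves every `half_life_s` seconds since the last use.
    recency = 0.5 ** (c.seconds_since_last_use / half_life_s)
    w_freq, w_rec, w_sem = weights
    return w_freq * frequency + w_rec * recency + w_sem * c.semantic_score


def rank(candidates, **score_options):
    """Sort candidates from all domains by the same score, best first."""
    max_count = max((c.use_count for c in candidates), default=0)
    return sorted(
        candidates,
        key=lambda c: unified_score(c, max_count, **score_options),
        reverse=True,
    )


if __name__ == "__main__":
    ranked = rank([
        Candidate("call mom", 10, 86400.0, 0.2),
        Candidate("call a taxi", 5, 0.0, 0.9),
        Candidate("calendar for today", 1, 172800.0, 0.1),
    ])
    print([c.text for c in ranked])
```

Because every candidate is scored on the same scale regardless of which domain produced it, the resulting order is comparable across domains, which is the property the abstract emphasizes.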
Abstract:
Methods, systems, and computer-readable media are provided for handwriting input functionality on a user device. A handwriting recognition module is trained to have a repertoire comprising multiple non-overlapping scripts and is capable of recognizing tens of thousands of characters using a single handwriting recognition model. The handwriting input module provides real-time, stroke-order and stroke-direction independent handwriting recognition. In some embodiments, temporally-derived features are used to improve recognition accuracy without compromising the stroke-order and stroke-direction independence of the recognition system.
Abstract:
Systems and processes for operating an intelligent automated assistant are provided. An example process includes receiving a text and a set of contextual information associated with the text; determining, using a system of neural networks, a plurality of text predictions based on the text and the contextual information, wherein a first text prediction of the plurality of text predictions includes a word and a second text prediction of the plurality of text predictions includes a phrase, and wherein the system of neural networks includes a first neural network for extracting a context, a second neural network for determining text predictions, and a third neural network for determining whether the text predictions are relevant to the context; and, in accordance with a determination that a plurality of confidence scores associated with the plurality of text predictions exceed a predetermined threshold, providing the plurality of text predictions.
Abstract:
Methods, systems, and computer-readable media are provided for handwriting input functionality on a user device. A handwriting recognition module is trained to have a repertoire comprising multiple non-overlapping scripts and is capable of recognizing tens of thousands of characters using a single handwriting recognition model. The handwriting input module provides real-time, stroke-order and stroke-direction independent handwriting recognition for multi-character handwriting input. In particular, real-time, stroke-order and stroke-direction independent recognition is provided for multi-character, or sentence-level, Chinese handwriting input. User interfaces for providing the handwriting input functionality are also disclosed.
Abstract:
Systems and processes for modifying word predictions are provided. In one example, a user input including one or more words is received. A prediction of a word sequence corresponding to one or more words is obtained, and context information associated with the word sequence is obtained. In accordance with a determination, based on the context information, that the prediction of the word sequence corresponds to a predetermined semantic reference, the prediction of the word sequence is modified, and an output corresponding to the modified prediction of the word sequence is provided. In accordance with a determination, based on the context information, that the prediction of the word sequence does not correspond to a predetermined semantic reference, an output corresponding to the prediction of the word sequence is provided.
Abstract:
Systems and processes for operating a digital assistant are provided. In accordance with one or more examples, a method includes receiving training data for a data-driven learning network. The training data include a plurality of word sequences. The method further includes obtaining representations of an initial set of semantic categories associated with the words included in the training data; and training the data-driven learning network based on the plurality of word sequences included in the training data and based on the representations of the initial set of semantic categories. The training is performed using the word sequences in their entirety. The method further includes obtaining, based on the trained data-driven learning network, representations of a set of semantic embeddings of the words included in the training data; and providing the representations of the set of semantic embeddings to at least one of a plurality of different natural language processing tasks.
Abstract:
The present disclosure generally relates to systems and processes for morpheme-based word prediction. An example method includes receiving a current word; determining a context of the current word based on the current word and a context of a previous word; determining, using a morpheme-based language model, a likelihood of a prefix based on the context of the current word; determining, using the morpheme-based language model, a likelihood of a stem based on the context of the current word; determining, using the morpheme-based language model, a likelihood of a suffix based on the context of the current word; determining a next word based on the likelihood of the prefix, the likelihood of the stem, and the likelihood of the suffix; and providing an output including the next word.
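The final step of the method, choosing a next word from the three morpheme likelihoods, can be sketched as below. This is a simplified illustration under stated assumptions: the prefix, stem, and suffix distributions are taken as already produced by the morpheme-based language model, the three likelihoods are treated as independent and multiplied, and a word is formed by plain concatenation with no spelling adjustments. The function name `predict_next_word` and the example morphemes are hypothetical.

```python
from itertools import product


def predict_next_word(prefix_probs, stem_probs, suffix_probs):
    """Pick the prefix+stem+suffix combination with the highest joint score.

    Each argument maps a morpheme (possibly the empty string) to its
    likelihood given the current context. Assumption: the joint score is the
    product of the three likelihoods.
    """
    best_word, best_score = None, -1.0
    for (prefix, p_pre), (stem, p_stem), (suffix, p_suf) in product(
        prefix_probs.items(), stem_probs.items(), suffix_probs.items()
    ):
        score = p_pre * p_stem * p_suf
        if score > best_score:
            # Plain concatenation; real morphology may need spelling changes.
            best_word, best_score = prefix + stem + suffix, score
    return best_word, best_score


if __name__ == "__main__":
    word, score = predict_next_word(
        prefix_probs={"un": 0.6, "": 0.4},
        stem_probs={"kind": 0.7, "fair": 0.3},
        suffix_probs={"ness": 0.8, "": 0.2},
    )
    print(word, score)
```

Scoring morphemes rather than whole words lets a model assign likelihood to word forms that never appeared verbatim in its training vocabulary, at the cost of enumerating combinations at prediction time.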