Abstract:
Exemplary embodiments of methods and apparatuses for automatic speech recognition are described. First model parameters associated with a first representation of an input signal are generated, and second model parameters associated with a second representation of the input signal are generated. The first representation is a discrete parameter representation that includes discrete parameters representing first portions of the input signal. The second representation includes a continuous parameter representation of residuals of the input signal, as well as discrete parameters representing second portions of the input signal that are smaller than the first portions. Third model parameters are generated to couple the first representation with the second representation, and the two representations are mapped into a vector space.
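The abstract above can be illustrated with a minimal toy sketch, not the patented method itself: here a frame of the input signal is quantized against a small codebook (the discrete representation), the quantization residual is kept as a continuous vector (the residual representation), and a randomly initialized embedding table plus projection matrix (standing in for learned coupling parameters) map both into one shared vector space. All names, sizes, and values are illustrative assumptions.

```python
import math
import random

random.seed(0)

# Illustrative assumptions: a 2-D codebook with 3 centroids, coupled into
# a 4-D shared vector space. In a real system these would be learned.
CODEBOOK = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]]        # discrete centroids
EMBED = {i: [random.gauss(0, 1) for _ in range(4)]      # index -> 4-D embedding
         for i in range(len(CODEBOOK))}
PROJ = [[random.gauss(0, 1) for _ in range(4)] for _ in range(2)]  # 2-D -> 4-D

def quantize(frame):
    """Return (codebook index, continuous residual) for one signal frame."""
    dists = [math.dist(frame, c) for c in CODEBOOK]
    idx = dists.index(min(dists))
    residual = [f - c for f, c in zip(frame, CODEBOOK[idx])]
    return idx, residual

def map_to_vector_space(frame):
    """Couple the discrete and continuous representations: embed the
    codebook index, project the residual, and sum them in a shared space."""
    idx, residual = quantize(frame)
    projected = [sum(r * w for r, w in zip(residual, col))
                 for col in zip(*PROJ)]
    return [e + p for e, p in zip(EMBED[idx], projected)]

vec = map_to_vector_space([1.1, 0.9])
print(len(vec))  # one 4-dimensional joint vector per frame
```

The coupling here is a simple sum of the embedded index and the projected residual; the abstract's "third model parameters" would correspond to whatever learned parameters perform that coupling.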
Abstract:
Systems and processes are disclosed for predicting words in a text entry environment. Candidate words and probabilities associated therewith can be determined by combining a word n-gram language model and a character m-gram language model. Based on entered text, candidate word probabilities from the word n-gram language model can be integrated with the corresponding candidate character probabilities from the character m-gram language model. A reduction in entropy can be determined by comparing the integrated candidate word probabilities before entry of the most recent character with those after entry of the most recent character. If the reduction in entropy exceeds a predetermined threshold, candidate words with high integrated probabilities can be displayed or otherwise made available to the user for selection. Otherwise, displaying candidate words can be deferred (e.g., pending receipt of an additional character from the user leading to reduced entropy in the candidate set).
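The entropy-reduction gate described above can be sketched as follows. This is a toy illustration under stated assumptions: the "models" are fixed dictionaries rather than real n-gram/m-gram models, integration is linear interpolation with an assumed weight, and the threshold value is invented for the example.

```python
import math

# Hypothetical toy model outputs (not real language models): each maps a
# candidate word to its probability given the current context.
word_ngram = {"hello": 0.5, "help": 0.3, "helm": 0.2}
char_mgram = {"hello": 0.4, "help": 0.4, "helm": 0.2}

def integrate(word_probs, char_probs, lam=0.5):
    """Linearly interpolate word- and character-model probabilities, then
    renormalize. The interpolation scheme and weight are assumptions."""
    words = set(word_probs) | set(char_probs)
    combined = {w: lam * word_probs.get(w, 0.0)
                   + (1 - lam) * char_probs.get(w, 0.0) for w in words}
    total = sum(combined.values())
    return {w: p / total for w, p in combined.items()}

def entropy(probs):
    """Shannon entropy (bits) of a candidate distribution."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

before = integrate(word_ngram, char_mgram)   # before the most recent character
after = integrate({"hello": 0.7, "help": 0.3},  # after it: fewer candidates
                  {"hello": 0.8, "help": 0.2})
reduction = entropy(before) - entropy(after)

THRESHOLD = 0.2  # assumed value for illustration
if reduction > THRESHOLD:
    print("show candidates:", sorted(after, key=after.get, reverse=True))
else:
    print("defer display, wait for another character")
```

When the newest character prunes or sharpens the candidate set enough (entropy drops by more than the threshold), the top integrated candidates are surfaced; otherwise display is deferred, exactly as the abstract describes.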
Abstract:
Methods, systems, and computer-readable media related to a technique for providing handwriting input functionality on a user device are disclosed. A handwriting recognition module is trained to have a repertoire comprising multiple non-overlapping scripts and to be capable of recognizing tens of thousands of characters using a single handwriting recognition model. The handwriting input module provides real-time, stroke-order and stroke-direction independent handwriting recognition. User interfaces for providing the handwriting input functionality are also disclosed.
Abstract:
Systems and methods for proactively populating an application with information that was previously viewed by a user in a different application are disclosed herein. An example method includes: while displaying a first application, obtaining information identifying a first physical location viewed by a user in the first application. The method also includes exiting the first application and, after exiting the first application, receiving a request from the user to open a second application that is distinct from the first application. In response to receiving the request and in accordance with a determination that the second application is capable of accepting geographic location information, the method includes presenting the second application so that the second application is populated with information that is based at least in part on the information identifying the first physical location.
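The flow above can be sketched minimally: remember the last physical location viewed in one application, and on opening a second application, populate it with that location only after determining that it can accept geographic location information. All names and the dictionary-based "application" objects are hypothetical stand-ins for the example.

```python
# Hypothetical sketch of the proactive-population flow; applications are
# modeled as plain dictionaries for illustration only.
last_viewed_location = None  # set while the first application is displayed

def record_viewed_location(location):
    """Called while the first application is shown: remember the physical
    location the user viewed there."""
    global last_viewed_location
    last_viewed_location = location

def open_application(app):
    """On a request to open a second application, populate it with the
    remembered location only if it accepts geographic location info."""
    if last_viewed_location is not None and app.get("accepts_geo"):
        app["prefill"] = dict(last_viewed_location)
    return app

record_viewed_location({"name": "Cafe", "lat": 37.33, "lon": -122.03})
ride_app = open_application({"name": "ride-sharing", "accepts_geo": True})
notes_app = open_application({"name": "notes", "accepts_geo": False})
print("prefill" in ride_app, "prefill" in notes_app)
```

The capability check (`accepts_geo` here) plays the role of the abstract's "determination that the second application is capable of accepting geographic location information."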
Abstract:
Differing embodiments of this disclosure may be employed to perform character sequence recognition with no explicit character segmentation. According to some embodiments, the character sequence recognition process may comprise generating a predicted character sequence for a first representation of a first image comprising a first plurality of pixels by: sliding a Convolutional Neural Network (CNN) classifier over the first representation of the first image one pixel position at a time until reaching an extent of the first representation of the first image; recording a likelihood value for each of k potential output classes at each pixel position, wherein one of the k potential output classes comprises a background class; determining a sequence of most likely output classes at each pixel position; decoding the sequence by removing identical consecutive output class determinations and background class determinations from the determined sequence; and validating the decoded sequence using one or more predetermined heuristics.
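The decoding step above (collapsing identical consecutive class determinations and dropping the background class) can be sketched as follows. Assumptions for the example: the per-position outputs are already the most likely class at each pixel position, classes are represented as strings, and `"-"` denotes the background class.

```python
BACKGROUND = "-"  # assumed label for the background output class

def decode(per_position_classes):
    """Collapse runs of identical consecutive output classes, then remove
    background determinations, as described in the abstract."""
    collapsed = []
    for cls in per_position_classes:
        if not collapsed or cls != collapsed[-1]:
            collapsed.append(cls)
    return [c for c in collapsed if c != BACKGROUND]

# Most likely class recorded at each pixel position as the classifier
# slides over the image representation one pixel at a time:
sequence = ["-", "4", "4", "-", "-", "2", "2", "2", "-", "4", "-"]
print("".join(decode(sequence)))  # -> "424"
```

Note that the background class between repeated characters is what allows genuinely repeated digits (e.g. "44") to survive the collapse step; without an intervening background position, consecutive identical determinations merge into one character.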
Abstract:
Differing embodiments of this disclosure may employ one or all of the several techniques described herein to perform credit card recognition using electronic devices with integrated cameras. According to some embodiments, the credit card recognition process may comprise: obtaining a first representation of a first image, wherein the first representation comprises a first plurality of pixels; identifying a first credit card region within the first representation; extracting a first plurality of sub-regions from within the identified first credit card region, wherein a first sub-region comprises a credit card number, wherein a second sub-region comprises an expiration date, and wherein a third sub-region comprises a card holder name; generating a predicted character sequence for the first, second, and third sub-regions; and validating the predicted character sequences for at least the first, second, and third sub-regions using various credit card-related heuristics, e.g., expected character sequence length, expected character sequence format, and checksums.
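The validation heuristics named above (expected length, expected format, and checksums) can be sketched with the standard Luhn checksum plus simple length and format checks. The specific thresholds and the expiry-date pattern are illustrative assumptions, not the patent's exact rules.

```python
import re

def luhn_valid(number: str) -> bool:
    """Luhn checksum, a standard heuristic for validating a predicted
    credit card number sequence."""
    if not number.isdigit():
        return False  # predicted sequence contains non-digit characters
    total = 0
    # Double every second digit from the right; subtract 9 if the result
    # exceeds 9, then sum all digits.
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def plausible_card(number: str, expiry: str, name: str) -> bool:
    """Combine simple heuristics on the three predicted sub-region
    sequences: expected length, expected format, and checksum. The
    specific checks here are illustrative assumptions."""
    return (13 <= len(number) <= 19
            and luhn_valid(number)
            and re.fullmatch(r"\d{2}/\d{2}", expiry) is not None
            and bool(name.strip()))

print(plausible_card("4111111111111111", "12/27", "A HOLDER"))  # -> True
```

A failed check on any sub-region would send the recognizer back for another frame rather than surfacing a misread card number to the user.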