Abstract:
A method includes obtaining first training data having multiple first linguistic samples. The method also includes generating second training data using the first training data and multiple symmetries. The symmetries identify how to modify the first linguistic samples while maintaining structural invariants within the first linguistic samples, and the second training data has multiple second linguistic samples. The method further includes training a machine learning model using at least the second training data. At least some of the second linguistic samples in the second training data are selected during the training based on a likelihood of being misclassified by the machine learning model.
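The augmentation and selection steps described above can be sketched as follows. This is a minimal illustration, not the claimed method: the names `augment`, `select_hardest`, and `prob_correct` are hypothetical, each symmetry is modeled as a plain text-to-text function, and the sketch assumes that a symmetry leaves the sample's label unchanged. "Likelihood of being misclassified" is approximated here by the model's probability for the correct label (lower means harder).

```python
from typing import Callable, List, Tuple

Sample = Tuple[str, str]          # (text, label)
Symmetry = Callable[[str], str]   # hypothetical: rewrites text, keeps structure


def augment(samples: List[Sample], symmetries: List[Symmetry]) -> List[Sample]:
    """Apply every symmetry to every first sample to build second samples.

    Assumption: a symmetry preserves the label, so the label is copied over.
    Transformations that leave the text unchanged are skipped.
    """
    out: List[Sample] = []
    for text, label in samples:
        for sym in symmetries:
            new_text = sym(text)
            if new_text != text:
                out.append((new_text, label))
    return out


def select_hardest(
    candidates: List[Sample],
    prob_correct: Callable[[str, str], float],
    k: int,
) -> List[Sample]:
    """Keep the k candidates the model is least likely to classify correctly.

    prob_correct(text, label) is a hypothetical hook returning the current
    model's probability for the true label.
    """
    return sorted(candidates, key=lambda s: prob_correct(s[0], s[1]))[:k]


# Toy usage with two hand-written symmetries and a stand-in model score.
first = [("book a flight to Boston", "book_flight")]
symmetries = [
    lambda t: t.replace("Boston", "Denver"),  # swap one city for another
    lambda t: t.replace("flight", "plane"),   # swap a synonym
]
second = augment(first, symmetries)
toy_score = lambda text, label: 0.9 if "flight" in text else 0.2
hardest = select_hardest(second, toy_score, k=1)
```

In a real training loop the selection would be repeated as the model changes, so the chosen samples track its current weaknesses.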
Abstract:
In some aspects, a method includes performing optical character recognition (OCR) based on data corresponding to a document to generate text data, detecting one or more bounded regions from the data based on a predetermined boundary rule set, and matching one or more portions of the text data to the one or more bounded regions to generate matched text data. Each bounded region of the one or more bounded regions encloses a corresponding block of text. The method also includes extracting features from the matched text data to generate a plurality of feature vectors and providing the plurality of feature vectors to a trained machine-learning classifier to generate one or more labels associated with the one or more bounded regions. The method further includes outputting metadata indicating a hierarchical layout associated with the document based on the one or more labels and the matched text data.
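The matching step, in which portions of OCR text are assigned to bounded regions, might look like the sketch below. The `Word` type, the box layout `(left, top, right, bottom)`, and the center-point containment rule are all assumptions made for illustration; the abstract does not say how the matching is done, and the feature extraction and classifier stages are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), assumed layout


@dataclass
class Word:
    """Hypothetical OCR output unit: recognized text plus its bounding box."""
    text: str
    box: Box


def center(box: Box) -> Tuple[float, float]:
    left, top, right, bottom = box
    return ((left + right) / 2.0, (top + bottom) / 2.0)


def contains(region: Box, point: Tuple[float, float]) -> bool:
    left, top, right, bottom = region
    x, y = point
    return left <= x <= right and top <= y <= bottom


def match_text_to_regions(words: List[Word], regions: List[Box]) -> Dict[int, str]:
    """Assign each word to the first region containing its center point.

    Returns a mapping from region index to the joined text of that region.
    Words that fall in no region are dropped.
    """
    matched: Dict[int, List[str]] = {i: [] for i in range(len(regions))}
    for word in words:
        point = center(word.box)
        for i, region in enumerate(regions):
            if contains(region, point):
                matched[i].append(word.text)
                break
    return {i: " ".join(texts) for i, texts in matched.items()}


# Toy usage: one heading-like region and one body-like region.
words = [
    Word("Title", (10, 10, 50, 20)),
    Word("Body", (10, 110, 40, 120)),
    Word("text", (45, 110, 70, 120)),
]
regions = [(0, 0, 100, 30), (0, 100, 100, 200)]
matched_text = match_text_to_regions(words, regions)
```

The per-region text produced here is what a later stage would turn into feature vectors for the classifier.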
Abstract:
An example non-transitory computer-readable medium stores machine-readable instructions. The instructions may cause a controller to receive an image and detect an object in the image. Based on a contextual setting of the electronic device, overlay data may be retrieved from a database or a remote electronic device to be visually associated with the object on a display.
Abstract:
A content device and method are disclosed. The content device includes a processing device to process streaming video content. A fingerprinter receives captured frames of the streaming video content and, for each frame of a plurality of the captured frames, generates a one-dimensional histogram function of pixel values and transforms the histogram function with a Fast Fourier Transform (FFT) to generate a plurality of complex values for the frame. For each of the plurality of complex values, the fingerprinter assigns a binary one ("1") when the real part of the complex value is greater than zero ("0") and a binary zero ("0") when the real part is less than or equal to zero, to generate a plurality of bits. The fingerprinter then concatenates a specific number of the bits to generate a fingerprint for the frame.
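The per-frame fingerprinting steps can be sketched in a few lines. This is a minimal illustration under stated assumptions: pixel values are 8-bit integers (so the histogram has 256 bins), the FFT is a small recursive radix-2 implementation written here to keep the example dependency-free, and the fingerprint takes the first `num_bits` bits, since the abstract does not say which bits are concatenated. The function names are hypothetical.

```python
import cmath
from typing import List, Sequence


def histogram(pixels: Sequence[int], bins: int = 256) -> List[float]:
    """One-dimensional histogram of pixel values (assumed to lie in 0..bins-1)."""
    counts = [0.0] * bins
    for p in pixels:
        counts[p] += 1.0
    return counts


def fft(x: Sequence[complex]) -> List[complex]:
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def frame_fingerprint(pixels: Sequence[int], num_bits: int = 64) -> str:
    """Histogram -> FFT -> sign of real part -> first num_bits bits.

    A bit is "1" when the real part is greater than zero, otherwise "0".
    Taking the leading bits is an assumption of this sketch.
    """
    spectrum = fft(histogram(pixels))
    bits = ["1" if value.real > 0 else "0" for value in spectrum]
    return "".join(bits[:num_bits])


# Toy usage: a tiny "frame" given as a flat list of 8-bit pixel values.
fingerprint = frame_fingerprint([12, 12, 200, 37, 37, 37, 255, 0], num_bits=32)
```

Because only the sign of each real part is kept, the fingerprint is compact and cheap to compare, for example by Hamming distance between two bit strings.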
Abstract:
Method, system and media for authenticating a subject as a user. Embodiments generate visual stories specific to the user, for which the subject must select the corresponding images from among a plurality of decoy images. Gaze tracking can be used to determine which images the user has selected without allowing an observer to learn which images have been selected. Images for the visual story can be retrieved from the user's social networking profile, and corresponding text stories generated to indicate which images should be selected. Multiple security levels are possible by varying the number of story images and decoy images.
Abstract:
An apparatus comprising at least one processor; and at least one memory, the memory comprising computer program code stored thereon, the at least one memory and computer program code being configured to, when run on the at least one processor, cause the apparatus to: process a passage of electronic text to identify at least one word associated with a geographical location in the passage of electronic text; search for an image-based representation of the geographical location associated with the at least one identified word; and output the image-based representation of the geographical location to a display.
Abstract:
A mobile device can receive OCR library information associated with a coarse position. The coarse position can be determined by the mobile device, or by a network server configured to communicate with the mobile device. A camera on the mobile device can obtain images of human-readable information in an area near the coarse position. The viewfinder image can be processed with an OCR engine that uses the OCR library information to determine one or more location string values. A location database can be searched based on the location string values. The position of the mobile device can be estimated and displayed. The estimated position can be adjusted based on the proximity of the mobile device to other features in the image.