Abstract:
An apparatus and method are provided for identifying and audibly presenting textual information within captured image data. In one implementation, a method is provided for audibly presenting text retrieved from a captured image. According to the method, at least one image of text is received from an image sensor, and the text may include a first portion and a second portion. The method includes identifying contextual information associated with the text, and accessing at least one rule associating the contextual information with at least one portion of text to be excluded from an audible presentation associated with the text. The method further includes performing an analysis on the at least one image to identify the first portion and the second portion, and causing the audible presentation of the first portion.
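As an illustration of the exclusion step this abstract describes, the following pure-Python sketch applies context-keyed rules to split OCR'd text into a spoken portion and a skipped portion. The rule table, the role labels, and the split_portions helper are illustrative assumptions, not the claimed implementation.

```python
# Hypothetical sketch: exclude text portions from audible output based on
# contextual rules (e.g., skip page headers when reading a book page).
from dataclasses import dataclass

@dataclass
class Rule:
    context: str          # contextual label, e.g. "book_page"
    excluded_role: str    # text role to drop, e.g. "header"

# Assumed rule table; a real system would load these from storage.
RULES = [
    Rule(context="book_page", excluded_role="header"),
    Rule(context="book_page", excluded_role="page_number"),
]

def split_portions(blocks, context):
    """Partition OCR'd blocks into a spoken portion and an excluded one."""
    excluded_roles = {r.excluded_role for r in RULES if r.context == context}
    spoken = [b for role, b in blocks if role not in excluded_roles]
    skipped = [b for role, b in blocks if role in excluded_roles]
    return spoken, skipped

blocks = [("header", "Chapter 3"), ("body", "Once upon a time..."),
          ("page_number", "42")]
spoken, skipped = split_portions(blocks, "book_page")
print(" ".join(spoken))   # -> "Once upon a time..." (only the spoken portion)
```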
Abstract:
A system and method are provided for accelerating machine reading of text. In one embodiment, the system comprises at least one processor device. The processor device is configured to receive at least one image of text to be audibly read. The text includes a first portion and a second portion. The processor device is further configured to initiate optical character recognition (OCR) to recognize the first portion. The processor device is further configured to initiate an audible presentation of the first portion prior to initiating OCR of the second portion, and simultaneously perform OCR to recognize the second portion of the text to be audibly read during presentation of at least part of the first portion. The processor device is further configured to automatically cause the second portion of the text to be audibly presented immediately upon completion of the presentation of the first portion.
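The pipelining idea here (speak segment N while recognizing segment N+1) can be sketched with a thread and a bounded queue. The ocr and speak functions below are stand-in stubs with simulated latency; nothing about them reflects the actual device.

```python
# Hypothetical sketch of the pipelined read-aloud loop: OCR of segment N+1
# runs while segment N is being spoken, so playback starts sooner.
import queue, threading, time

def ocr(segment):                 # stand-in for a real OCR call
    time.sleep(0.2)               # simulate recognition latency
    return f"text of {segment}"

def speak(text):                  # stand-in for a text-to-speech call
    time.sleep(0.3)               # simulate playback time
    print("speaking:", text)

def read_aloud(segments):
    ready = queue.Queue(maxsize=1)

    def recognize():
        for seg in segments:
            ready.put(ocr(seg))   # OCR runs ahead of playback
        ready.put(None)           # sentinel: no more text

    threading.Thread(target=recognize, daemon=True).start()
    while (text := ready.get()) is not None:
        speak(text)               # next segment is recognized meanwhile

read_aloud(["first portion", "second portion"])
```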
Abstract:
A device and method are provided for providing feedback based on the state of an object. In one implementation, an apparatus for processing images is provided. The apparatus may include an image sensor configured to capture real time images from an environment of a user and at least one processor device configured to initially process at least one image to determine whether an object is likely to change its state. If a determination is made that the object is unlikely to change its state, the at least one processor device may additionally process the at least one image and provide a first feedback. If a determination is made that the object is likely to change its state, the at least one processor device may continue to capture images of the object and alert the user with a second feedback after a change in the state of the object occurs.
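A minimal sketch of the two-branch logic follows, assuming hypothetical likely_to_change and current_state classifiers; a real device would derive both from the image stream.

```python
# Hypothetical sketch of the branch described above: static objects get
# immediate feedback, dynamic objects are monitored until their state changes.
import time

def likely_to_change(obj):        # stand-in classifier (e.g., traffic light)
    return obj == "traffic_light"

def current_state(obj):           # stand-in state detector
    return "red" if time.time() % 2 < 1 else "green"

def process(obj):
    if not likely_to_change(obj):
        return f"first feedback: {obj} identified"
    initial = current_state(obj)
    while current_state(obj) == initial:   # keep capturing images
        time.sleep(0.1)
    return f"second feedback: {obj} changed from {initial}"

print(process("street_sign"))     # immediate first feedback
print(process("traffic_light"))   # alert only after the state flips
```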
Abstract:
A device and method are provided for audible facial recognition. In one implementation, an apparatus for aiding a visually impaired user to identify individuals is provided. The apparatus may include a portable image sensor configured to be worn by the visually impaired user and to capture real-time image data from an environment of the user. The apparatus may also include at least one portable processor device configured to determine an existence of face-identifying information in the real-time image data, and access stored facial information and audible indicators. The at least one portable processor device may also be configured to compare the face-identifying information with the stored facial information, and identify a match. Based on the match, the at least one portable processor may be configured to cause an audible indicator to be announced to the visually impaired user.
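One plausible reading of the compare-and-match step is a nearest-neighbor search over stored face embeddings, as in the sketch below. The embeddings, the cosine-similarity metric, and the 0.9 threshold are all assumptions for illustration.

```python
# Hypothetical sketch: match a face embedding from the live image against
# stored embeddings and announce the associated audible indicator.
import math

STORED = {                        # assumed database: name -> embedding
    "Alice": [0.9, 0.1, 0.4],
    "Bob":   [0.2, 0.8, 0.5],
}
INDICATORS = {"Alice": "Alice is in front of you",
              "Bob": "Bob is in front of you"}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def identify(embedding, threshold=0.9):
    name, score = max(((n, cosine(embedding, e)) for n, e in STORED.items()),
                      key=lambda t: t[1])
    return INDICATORS[name] if score >= threshold else None

announcement = identify([0.88, 0.12, 0.41])   # embedding from the image sensor
print(announcement or "no match")             # -> announce via TTS on a match
```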
Abstract:
A device and method are provided for performing a triggered action. In one implementation, an apparatus for processing real time images of an environment of a user is provided. The apparatus may include an image sensor configured to capture image data for providing a plurality of sequential images of the environment of the user. The apparatus may also include at least one processor device configured to identify a trigger associated with a desire of the user to cause at least one pre-defined action associated with an object. The trigger may include an erratic movement of the object. In response to identification of the trigger, the at least one processor device may also be configured to identify a captured representation of the object. Based on at least the captured representation of the object, the at least one processor device may be configured to execute the at least one pre-defined action.
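The "erratic movement" trigger could be approximated by counting direction reversals in an object's tracked positions across sequential frames, as in this hypothetical sketch; the reversal threshold is an arbitrary assumption.

```python
# Hypothetical sketch: flag an "erratic movement" trigger when an object's
# tracked positions across sequential frames show repeated direction reversals.
def is_erratic(positions, threshold=2):
    """positions: per-frame (x, y) centroids of the tracked object."""
    deltas = [(x2 - x1, y2 - y1)
              for (x1, y1), (x2, y2) in zip(positions, positions[1:])]
    if len(deltas) < 2:
        return False
    # Count sign reversals in the movement direction between frames.
    reversals = sum(1 for (dx1, dy1), (dx2, dy2) in zip(deltas, deltas[1:])
                    if dx1 * dx2 + dy1 * dy2 < 0)
    return reversals >= threshold

steady  = [(0, 0), (1, 0), (2, 0), (3, 0)]
shaking = [(0, 0), (2, 0), (0, 1), (2, 1), (0, 2)]
print(is_erratic(steady))    # False: no trigger
print(is_erratic(shaking))   # True: execute the pre-defined action
```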
Abstract:
An apparatus and method are provided for providing feedback to a user, who may be visually impaired. In one implementation, a method is provided for providing feedback to a visually impaired user. The method comprises receiving, from a mobile image sensor, real time image data that includes a representation of an object in an environment of the visually impaired user. The mobile image sensor is configured to be connected to glasses worn by the visually impaired user. Further, the method comprises receiving a signal indicating a desire of the visually impaired user to obtain information about the object. The method also includes accessing a database holding information about a plurality of objects, and comparing information derived from the received real time image data with information in the database. Finally, the method comprises providing the visually impaired user with nonvisual feedback that the object is not locatable in the database.
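A minimal sketch of the lookup-and-feedback flow follows. The toy database, the derive_descriptor stub, and the nonvisual_feedback stub are assumptions standing in for real feature extraction and audio or haptic output.

```python
# Hypothetical sketch of the lookup-and-feedback flow: compare a descriptor
# derived from the image against a small object database and report a miss.
DATABASE = {"can_of_soup": "Tomato soup, 400 g",
            "bus_sign":    "Route 12, city center"}

def derive_descriptor(image_data):      # stand-in for feature extraction
    return image_data.get("label")

def nonvisual_feedback(message):        # stand-in for TTS or haptic output
    print("feedback:", message)

def handle_query(image_data):
    key = derive_descriptor(image_data)
    if key in DATABASE:
        nonvisual_feedback(DATABASE[key])
    else:
        nonvisual_feedback("object not found in database")

handle_query({"label": "can_of_soup"})  # known object
handle_query({"label": "umbrella"})     # miss -> "not locatable" feedback
```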
Abstract:
An apparatus and method are provided for performing one or more actions based on triggers detected within captured image data. In one implementation, a method is provided for audibly reading text retrieved from a captured image. According to the method, real-time image data is captured from an environment of a user, and an existence of a trigger is determined within the captured image data. In one aspect, the trigger may be associated with a desire of the user to hear text read aloud, and the trigger identifies an intermediate portion of the text at a distance from a level break in the text. The method includes performing a layout analysis on the text to identify the level break associated with the trigger, and reading aloud text beginning from the level break associated with the trigger.
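The level-break behavior can be illustrated with a toy layout analysis that treats paragraph starts as level breaks and backs up from the triggered offset to the nearest one; the paragraph-splitting heuristic below is an assumption, not the patent's layout analysis.

```python
# Hypothetical sketch: given a trigger that points into the middle of the
# text, back up to the nearest level break (here: paragraph start) and
# begin reading aloud from that break.
def level_breaks(text):
    """Offsets where a new paragraph begins (a simple stand-in for
    the layout analysis described in the abstract)."""
    breaks, offset = [0], 0
    for para in text.split("\n\n")[:-1]:
        offset += len(para) + 2
        breaks.append(offset)
    return breaks

def read_from_trigger(text, trigger_offset):
    start = max(b for b in level_breaks(text) if b <= trigger_offset)
    return text[start:]

doc = "First paragraph.\n\nSecond paragraph, pointed at mid-way.\n\nThird."
print(read_from_trigger(doc, doc.index("pointed")))
# -> reads aloud beginning at "Second paragraph..."
```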
Abstract:
Our invention is a method for automatically translating the semantic content of scientific images into a standardized textual format in real time. The aim is to make such images accessible to visually impaired users through assistive technologies. The non-limiting examples and embodiments of the present invention relate generally to cognitive vision, and in particular to the implementation of a solution for interpreting scientific images in an educational context.
Abstract:
In one example, a method includes receiving a digital note of a plurality of digital notes generated based on image data comprising a visual representation of a scene that includes a plurality of physical notes, such that each of the plurality of digital notes respectively corresponds to a particular physical note of the plurality of physical notes, wherein each of the physical notes includes respective recognizable content. In this example, the method also includes receiving user input indicating a modification to one or more visual characteristics of the digital note, editing, in response to the user input, the one or more visual characteristics of the digital note, and outputting, for display, a modified version of the digital note that includes the one or more modified visual characteristics.
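A minimal data-model sketch of the edit flow follows, assuming a DigitalNote record whose visual characteristics are a color and a border; the field names and the allowed-fields check are illustrative only.

```python
# Hypothetical sketch of the digital-note edit flow: apply a user-requested
# change to a note's visual characteristics and output the modified version.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class DigitalNote:
    note_id: int
    content: str          # recognized content of the physical note
    color: str            # visual characteristic
    border: str           # visual characteristic

def edit_note(note, **changes):
    """Return a modified copy; only visual fields may be changed here."""
    allowed = {"color", "border"}
    bad = set(changes) - allowed
    if bad:
        raise ValueError(f"not a visual characteristic: {bad}")
    return replace(note, **changes)

note = DigitalNote(note_id=7, content="Buy milk", color="yellow", border="none")
modified = edit_note(note, color="blue")       # user input: recolor the note
print(modified)                                # output for display
```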
Abstract:
A device and method are provided for recognizing text on a curved surface. In one implementation, the device comprises an image sensor configured to capture, from an environment of a user, multiple images of text on a curved surface. The device also comprises at least one processor device. The at least one processor device is configured to receive a first image of a first perspective of the text on the curved surface, receive a second image of a second perspective of the text on the curved surface, perform optical character recognition on at least parts of each of the first image and the second image, combine the results of the optical character recognition on the first image and on the second image, and provide the user with a recognized representation of the text, including a recognized representation of a first portion of the text.
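One way to combine results from two perspectives is to keep, at each aligned character position, whichever recognition carries higher confidence, as sketched below; the per-character confidences and the pre-aligned positions are simplifying assumptions.

```python
# Hypothetical sketch: merge OCR results from two perspectives of a curved
# surface, keeping the higher-confidence character at each aligned position.
def combine(result_a, result_b):
    """Each result is a list of (char, confidence); positions are assumed
    to be pre-aligned, which a real system would have to establish."""
    merged = []
    for (ca, pa), (cb, pb) in zip(result_a, result_b):
        merged.append(ca if pa >= pb else cb)
    return "".join(merged)

# Curvature blurs different characters in each perspective.
first  = [("H", 0.9), ("e", 0.9), ("l", 0.4), ("l", 0.9), ("o", 0.3)]
second = [("H", 0.8), ("c", 0.3), ("l", 0.9), ("l", 0.8), ("o", 0.9)]
print(combine(first, second))   # -> "Hello"
```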