Abstract:
A method includes receiving first image data at an electronic device, and performing a first image recognition operation on the first image data based on a first image recognition model stored in a memory of the electronic device. The method may include sending an image recognition model update request from the electronic device to a server, in response to determining that a result of the first image recognition operation fails to satisfy a confidence threshold. The method includes receiving image recognition model update information from the server and updating the first image recognition model based on the image recognition model update information to generate a second image recognition model. The method further includes performing a second image recognition operation based on the second image recognition model.
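The flow described above can be sketched as a small simulation: run recognition on-device, and only when the result fails a confidence threshold request update information from the server, merge it into the model, and recognize again. All names here (`recognize`, `fetch_model_update`, the dict-based "model") are illustrative stand-ins, not the patented implementation.

```python
CONFIDENCE_THRESHOLD = 0.8  # illustrative threshold

def recognize(model, image_data):
    """Stand-in recognizer: the 'model' is a lookup table of (label, confidence)."""
    return model.get(image_data, ("unknown", 0.0))

def fetch_model_update(server_models, image_data):
    """Stand-in for the server round trip returning image recognition model update information."""
    return {image_data: server_models[image_data]}

def recognize_with_update(first_model, server_models, image_data):
    # First image recognition operation, using the locally stored model
    label, confidence = recognize(first_model, image_data)
    if confidence >= CONFIDENCE_THRESHOLD:
        return label, first_model
    # Result failed the confidence threshold: request update information
    update = fetch_model_update(server_models, image_data)
    second_model = {**first_model, **update}  # updated (second) recognition model
    # Second image recognition operation, using the updated model
    label, confidence = recognize(second_model, image_data)
    return label, second_model

first_model = {"img_cat": ("cat", 0.95)}
server_models = {"img_dog": ("dog", 0.90)}
label, updated = recognize_with_update(first_model, server_models, "img_dog")
```

After the update, the second recognition succeeds for the previously unrecognized input while the original model entries are preserved.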
Abstract:
Aspects of the subject disclosure may include, for example, receiving first image data for a first image from an electronic device, the first image data including first information associated with a first object identified by the electronic device as being in the first image. Second image data for a second image is received from the electronic device subsequent to receiving the first image data. The second image data includes second information associated with a second object identified by the electronic device as being in the second image. Location information is determined based on the first information and the second information and sent to the electronic device. Other embodiments are disclosed.
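One hypothetical realization of the server-side step: the device reports two recognized landmark objects, the server looks up their known coordinates, and estimates a location between them. The landmark table and the midpoint estimate are assumptions for illustration only; the abstract does not specify how the location is computed from the two pieces of information.

```python
# Assumed table of known landmark coordinates (lat, lon); purely illustrative.
LANDMARKS = {
    "clock_tower": (40.7484, -73.9857),
    "fountain":    (40.7527, -73.9772),
}

def determine_location(first_info, second_info):
    """Combine first and second object reports into location information.

    Here 'location' is naively estimated as the midpoint of the two
    recognized landmarks; a real system would use a proper geometric model.
    """
    lat1, lon1 = LANDMARKS[first_info["object"]]
    lat2, lon2 = LANDMARKS[second_info["object"]]
    return ((lat1 + lat2) / 2, (lon1 + lon2) / 2)

loc = determine_location({"object": "clock_tower"}, {"object": "fountain"})
```

The returned `loc` is what the server would send back to the electronic device.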
Abstract:
A method includes capturing a first image via a camera of an electronic device using a first set of image capture parameters. The method includes identifying a first object in the first image based on an object recognition model. The method includes sending first image data to a server. The first image data includes first information associated with the first object. The method includes capturing a second image via the camera using a second set of image capture parameters. The method includes identifying a second object in the second image based on the object recognition model. The method includes sending second image data to the server. The second image data includes second information associated with the second object. The method also includes receiving location information from the server. The location information is determined by the server based on the first information and the second information.
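The client-side sequence above can be sketched end to end: capture two images with different capture-parameter sets, identify an object in each with the on-device model, send both reports, and receive the server-computed location. The `camera`, `recognizer`, and `server_locate` functions are all stand-ins invented for this sketch.

```python
def camera(params):
    """Stand-in camera: returns a frame tagged with its capture parameters."""
    return f"frame@{params['exposure']}"

def recognizer(image):
    """Stand-in object recognition model."""
    return "landmark_a" if "short" in image else "landmark_b"

def server_locate(reports):
    """Stand-in for the server combining both reports into location information."""
    return {"objects": [r["object"] for r in reports],
            "location": (40.75, -73.98)}

reports = []
for params in ({"exposure": "short"}, {"exposure": "long"}):  # two parameter sets
    image = camera(params)
    obj = recognizer(image)
    reports.append({"object": obj, "params": params})  # image data sent to server

location_info = server_locate(reports)
```

Each loop iteration corresponds to one capture/identify/send step in the claimed method; the final call models receiving the location information back.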
Abstract:
Disclosed are systems, methods and computer-readable media for enabling speech processing in a user interface of a device. The method includes receiving an indication of a field in a user interface of a device, the indication also signaling that speech will follow, receiving the speech from the user at the device, the speech being associated with the field, transmitting the speech as a request to a public, common network node that receives and processes speech, processing the transmitted speech and returning text associated with the speech to the device, and inserting the text into the field. Upon a second indication from the user, the system processes the text in the field as programmed by the user interface. The present disclosure provides a speech mash-up application for a user interface of a mobile or desktop device that does not require expensive speech processing technologies.
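The field-filling flow can be illustrated with stand-ins for the device UI and the shared network speech node. No real speech API is used here; `asr_node` is a placeholder that pretends to transcribe audio, and the field/indication names are assumptions.

```python
def asr_node(audio):
    """Stand-in for the public, common network node that receives and processes speech."""
    return audio.replace("_audio", "")  # pretend transcription

form = {"destination": ""}  # illustrative user-interface field

def on_field_indication(field, audio):
    """First indication: selects the field and signals that speech will follow.

    The captured speech is transmitted to the network node and the
    returned text is inserted into the indicated field.
    """
    form[field] = asr_node(audio)

def on_submit_indication():
    """Second indication: process the field text as programmed by the UI."""
    return f"searching for {form['destination']}"

on_field_indication("destination", "boston_audio")
result = on_submit_indication()
```

The device itself never runs a recognizer; it only ships audio to the shared node and handles the text that comes back, which is the point of the mash-up design.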
Abstract:
Systems, methods, and computer-readable storage media for intelligent caching of concatenative speech units for use in speech synthesis. A system configured to practice the method can identify a speech synthesis context, and determine, based on a local cache of text-to-speech units for a text-to-speech voice and based on the speech synthesis context, additional text-to-speech units which are not in the local cache. The system can request from a server the additional text-to-speech units, and store the additional text-to-speech units in the local cache. The system can then synthesize speech using the text-to-speech units and the additional text-to-speech units in the local cache. The system can prune the cache as the context changes, based on availability of local storage, or after synthesizing the speech. The local cache can store a core set of text-to-speech units associated with the text-to-speech voice that cannot be pruned from the local cache.
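A minimal sketch of the caching scheme, under the assumption that text-to-speech units can be modeled as opaque tokens: the local cache holds a core set that is never pruned, fetches context-specific units from a stand-in server, and prunes everything else when the context changes. The class and method names are illustrative.

```python
class UnitCache:
    """Local cache of concatenative text-to-speech units."""

    def __init__(self, core_units):
        self.core = set(core_units)   # core set: cannot be pruned
        self.units = set(core_units)

    def units_needed(self, context_units):
        """Units the current synthesis context requires but the cache lacks."""
        return set(context_units) - self.units

    def fetch_from_server(self, server_units, context_units):
        """Request the missing units from the (stand-in) server and store them."""
        self.units |= self.units_needed(context_units) & set(server_units)

    def prune(self):
        """Drop everything except the non-prunable core set."""
        self.units = set(self.core)

cache = UnitCache(core_units={"a", "b"})
cache.fetch_from_server(server_units={"c", "d", "e"},
                        context_units={"a", "c", "d"})
# Synthesis would now draw on the cached units {"a", "b", "c", "d"}.
cache.prune()  # context changed: only the core set remains
```

Pruning here is all-or-nothing for simplicity; the abstract also allows pruning driven by local storage availability or completion of synthesis.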