Item recognition using context data

    Publication number: US10043069B1

    Publication date: 2018-08-07

    Application number: US14196669

    Application date: 2014-03-04

    Abstract: A system for recognizing objects and/or text in image data may use context data to perform object/text recognition. The system may also use context data when determining potential functions to execute in response to recognizing the object/text. Context data may be gathered based on device sensor data, user profile data such as the behavior of a user or the behavior of those in a user's social network, or other factors. Recognition processing and/or function selection may be configured to account for context data when operating to improve output results.

    PROCESSING COMPLEX UTTERANCES FOR NATURAL LANGUAGE UNDERSTANDING

    Publication number: US20230032575A1

    Publication date: 2023-02-02

    Application number: US17882874

    Application date: 2022-08-08

    Abstract: A system capable of performing natural language understanding (NLU) on utterances including complex command structures such as sequential commands (e.g., multiple commands in a single utterance), conditional commands (e.g., commands that are only executed if a condition is satisfied), and/or repetitive commands (e.g., commands that are executed until a condition is satisfied). Audio data may be processed using automatic speech recognition (ASR) techniques to obtain text. The text may then be processed using machine learning models that are trained to parse text of incoming utterances. The models may identify complex utterance structures and may identify what command portions of an utterance go with what conditional statements. Machine learning models may also identify what data is needed to determine when the conditionals are true so the system may cause the commands to be executed (and stopped) at the appropriate times.

    Determining camera auto-focus settings

    Publication number: US09826156B1

    Publication date: 2017-11-21

    Application number: US14741248

    Application date: 2015-06-16

    Abstract: A system and method of determining a tilt angle of a portable computing device using sensor data; identifying a tilt angle from a plurality of predetermined tilt angle ranges; determining focal length settings for image capture devices of the portable computing device using the tilt angle, adjustment increments, and autofocus scan range algorithms. A portable computing device including a processor; a first image capture device on a first side of the portable computing device, and a second image capture device on the second side of the portable computing device, the second side located opposite of the first side; and a memory device including instructions operable to be executed by the processor to perform a set of actions, enabling the portable computing device to perform the method.

    Image processing using multiple aspect ratios
    Granted invention patent (status: in force)

    Publication number: US09418283B1

    Publication date: 2016-08-16

    Application number: US14463961

    Application date: 2014-08-20

    Abstract: A system to recognize text, objects, or symbols in a captured image using machine learning models reduces computational overhead by generating a plurality of thumbnail versions of the image at different downscaled resolutions and aspect ratios, and then processing the downscaled images instead of the entire image, or sections of the entire image. The downscaled images are processed to produce a combined feature vector characterizing the overall image. The combined feature vector is processed using the machine learning model.
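
    Illustrative sketch: the thumbnail-and-concatenate idea in the abstract above might look roughly like the following. This is a minimal sketch under assumptions, not the patented method: the block-averaging downscaler, the use of raw pixel values as features, and the names `downscale` and `combined_feature_vector` are all hypothetical, chosen only to show several aspect ratios feeding one vector.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]  # grayscale pixels in row-major order


def downscale(image: Image, out_w: int, out_h: int) -> Image:
    """Shrink an image to out_w x out_h by averaging blocks of source pixels."""
    in_h, in_w = len(image), len(image[0])
    result = []
    for oy in range(out_h):
        # Source rows covered by this output row (always at least one row).
        y0 = oy * in_h // out_h
        y1 = max(y0 + 1, (oy + 1) * in_h // out_h)
        row = []
        for ox in range(out_w):
            x0 = ox * in_w // out_w
            x1 = max(x0 + 1, (ox + 1) * in_w // out_w)
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


def combined_feature_vector(
    image: Image, sizes: Sequence[Tuple[int, int]]
) -> List[float]:
    """Concatenate the pixels of one thumbnail per (width, height) in sizes.

    Assumption: raw thumbnail pixels stand in for whatever features a real
    system would extract from each downscaled image.
    """
    features: List[float] = []
    for width, height in sizes:
        thumbnail = downscale(image, width, height)
        features.extend(pixel for row in thumbnail for pixel in row)
    return features


image = [
    [0, 0, 4, 4],
    [0, 0, 4, 4],
    [8, 8, 12, 12],
    [8, 8, 12, 12],
]
# One square thumbnail and one wide, flat thumbnail.
vector = combined_feature_vector(image, [(2, 2), (4, 1)])
```

    The resulting vector has one entry per thumbnail pixel, so its length depends only on the chosen thumbnail sizes and not on the resolution of the captured image.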

    Text orientation estimation in camera captured OCR
    Granted invention patent (status: in force)

    Publication number: US09224061B1

    Publication date: 2015-12-29

    Application number: US14464365

    Application date: 2014-08-20

    CPC classification number: G06K9/3208 G06K9/3258 G06K2209/01

    Abstract: A system estimates text orientation in images captured using a handheld camera prior to detecting text in the image. Text orientation is estimated based on edges detected within the image, and the image is rotated based on the estimated orientation. Text detection and processing is then performed on the rotated image. Non-text features along a periphery of the image may be sampled to assure that clutter will not undermine the estimation of orientation.
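
    Illustrative sketch: one simple way to estimate a dominant orientation from edges is a magnitude-weighted histogram of edge directions. The abstract does not say which estimator the patent uses, so the central-difference gradients, the 10-degree bins, and the function name `estimate_text_orientation` below are assumptions made for illustration only.

```python
import math
from typing import List, Optional

Image = List[List[float]]  # grayscale pixels in row-major order


def estimate_text_orientation(image: Image, bin_width: int = 10) -> Optional[float]:
    """Return the dominant edge direction in degrees within [0, 180).

    Returns None when the image contains no measurable edges.
    """
    n_bins = 180 // bin_width
    histogram = [0.0] * n_bins
    height, width = len(image), len(image[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central-difference intensity gradient at this pixel.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # An edge runs perpendicular to the intensity gradient.
            theta = (math.degrees(math.atan2(gy, gx)) + 90.0) % 180.0
            # Bins are centred on multiples of bin_width; 180 wraps to 0.
            histogram[round(theta / bin_width) % n_bins] += magnitude
    strongest = max(histogram)
    if strongest == 0:
        return None
    return float(histogram.index(strongest) * bin_width)


# A dark band above a bright band produces a horizontal edge.
horizontal_step = [[0.0] * 6 for _ in range(3)] + [[255.0] * 6 for _ in range(3)]
# Transposing the image turns that edge vertical.
vertical_step = [list(column) for column in zip(*horizontal_step)]
```

    A caller would then rotate the image by the negative of the estimated angle before running text detection, which is the step the abstract describes next.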

    Local image enhancement for text recognition
    Granted invention patent (status: in force)

    Publication number: US09058644B2

    Publication date: 2015-06-16

    Application number: US13800951

    Application date: 2013-03-13

    Abstract: Various embodiments enable regions of text to be identified in an image captured by a camera of a computing device for preprocessing before being analyzed by a visual recognition engine. For example, each of the identified regions can be analyzed or tested to determine whether a respective region contains a quality associated with poor text recognition results, such as poor contrast, blur, noise, and the like, which can be measured by one or more algorithms. Upon identifying a region with such a quality, an image quality enhancement can be automatically applied to the respective region without user instruction or intervention. Accordingly, once each region has been cleared of the quality associated with poor recognition, the regions of text can be processed with a visual recognition algorithm or engine.
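
    Illustrative sketch: the test-then-enhance loop described above can be shown with a single quality measure. The sketch assumes contrast is measured as the max-minus-min pixel range and that the enhancement is a linear contrast stretch; the threshold of 50 and all function names are hypothetical, and a real system would likely test for blur and noise as well.

```python
from typing import List

Region = List[List[float]]  # grayscale pixels of one text region


def contrast(region: Region) -> float:
    """Measure contrast as the spread between darkest and brightest pixel."""
    pixels = [p for row in region for p in row]
    return max(pixels) - min(pixels)


def stretch_contrast(region: Region, lo: float = 0.0, hi: float = 255.0) -> Region:
    """Linearly rescale pixel values so they span the full [lo, hi] range."""
    pixels = [p for row in region for p in row]
    p_min, p_max = min(pixels), max(pixels)
    if p_max == p_min:
        # A flat region has nothing to stretch; return an unchanged copy.
        return [list(row) for row in region]
    scale = (hi - lo) / (p_max - p_min)
    return [[lo + (p - p_min) * scale for p in row] for row in region]


def enhance_low_contrast_regions(
    regions: List[Region], min_contrast: float = 50.0
) -> List[Region]:
    """Enhance only the regions that fail the contrast test.

    Regions that already pass are handed on untouched, so enhancement cost
    is paid only where recognition is likely to suffer.
    """
    return [
        stretch_contrast(region) if contrast(region) < min_contrast else region
        for region in regions
    ]


faint = [[100.0, 110.0], [120.0, 130.0]]  # contrast 30: below the threshold
crisp = [[0.0, 255.0]]                    # contrast 255: left as-is
enhanced = enhance_low_contrast_regions([faint, crisp])
```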

    Determining camera auto-focus settings

    Publication number: US09854155B1

    Publication date: 2017-12-26

    Application number: US14741201

    Application date: 2015-06-16

    Abstract: A system and method of determining a tilt angle of a portable computing device using a sensor indicating gravitational pull on the device; determining the tilt angle of a camera of the device; identifying a tilt angle range from a plurality of predetermined tilt angle ranges; determining a first focal length setting using a first array that associates the tilt angle range with the first focal length setting; determining an adjustment increment using a second array that associates the adjustment increment with the tilt angle range; and determining a second focal length setting of the camera using the adjustment increment according to an autofocus scan range algorithm. A portable computing device including a processor; a camera; and a memory device including instructions operable to be executed by the processor to perform a set of actions, enabling the portable computing device to perform the method.
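
    Illustrative sketch: the lookup chain in the abstract above (gravity reading, tilt angle, tilt range, then two arrays indexed by that range) could be arranged as follows. Every number in the tables is a made-up placeholder, the device z-axis is assumed to be the camera axis, and deriving the second setting as "first plus one increment" is only a stand-in for the autofocus scan range algorithm, which the abstract does not spell out.

```python
import math
from typing import Tuple

# Hypothetical tilt ranges in degrees; the last range includes its upper bound.
TILT_RANGES = [(0.0, 30.0), (30.0, 60.0), (60.0, 90.0)]
# First array: tilt range index -> first focal setting (placeholder lens positions).
FIRST_FOCAL_SETTING = [10, 40, 80]
# Second array: tilt range index -> adjustment increment (placeholder step sizes).
ADJUSTMENT_INCREMENT = [2, 5, 10]


def tilt_angle_degrees(gx: float, gy: float, gz: float) -> float:
    """Angle between the camera axis (assumed device z) and the gravity vector."""
    norm = math.sqrt(gx * gx + gy * gy + gz * gz)
    if norm == 0:
        raise ValueError("gravity vector must be non-zero")
    cosine = max(-1.0, min(1.0, abs(gz) / norm))
    return math.degrees(math.acos(cosine))


def tilt_range_index(angle: float) -> int:
    """Find which predetermined tilt range contains the angle."""
    for index, (low, high) in enumerate(TILT_RANGES):
        if low <= angle < high:
            return index
    if angle == TILT_RANGES[-1][1]:
        return len(TILT_RANGES) - 1
    raise ValueError(f"tilt angle {angle} is outside all ranges")


def focal_settings(angle: float) -> Tuple[int, int]:
    """Return (first, second) focal settings for a tilt angle.

    Assumption: the second setting is one adjustment increment past the first.
    """
    index = tilt_range_index(angle)
    first = FIRST_FOCAL_SETTING[index]
    return first, first + ADJUSTMENT_INCREMENT[index]


# Device lying flat: gravity is entirely along the camera axis.
flat = focal_settings(tilt_angle_degrees(0.0, 0.0, 1.0))
```

    Keeping the focal settings and increments in parallel arrays means the per-frame work is one range search and two index lookups, which is presumably why the abstract frames the method in terms of arrays.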

    Visual and audio recognition for scene change events
    Granted invention patent (status: in force)

    Publication number: US09536161B1

    Publication date: 2017-01-03

    Application number: US14307090

    Application date: 2014-06-17

    Abstract: Various embodiments describe systems and methods for utilizing a reduced amount of processing capacity for incoming data over time and, in response to detecting a scene-change-event, notifying one or more data processors that a scene-change-event has occurred and causing incoming data to be processed as new data. In some embodiments, an incoming frame can be compared with a reference frame to determine a difference between the reference frame and the incoming frame. The reference frame may be correlated to the latest scene-change-event. In response to a determination that the difference does not meet one or more difference criteria, a user interface or at least one processor of the computing device can be notified to reduce processing of incoming data over time. In response to a determination that the difference meets the one or more difference criteria, the user interface or the at least one processor can be notified that a scene-change-event has occurred. Incoming data to the computing device is then treated as new and processed as it would be under an active condition. The current incoming frame can be selected as a new reference frame for detecting the next scene-change-event.
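
    Illustrative sketch: the reference-frame comparison described above reduces to a small state machine. The sketch assumes frames are flat lists of grayscale values and that the single difference criterion is a mean absolute difference threshold; the class name `SceneChangeDetector` and the threshold value are hypothetical.

```python
from typing import List, Optional, Sequence


def mean_abs_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


class SceneChangeDetector:
    """Compare each incoming frame against the frame of the last scene change."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.reference: Optional[List[float]] = None

    def process(self, frame: Sequence[float]) -> bool:
        """Return True when the frame counts as a scene-change event.

        On an event the frame becomes the new reference, so later frames are
        compared against the scene as it looked when it last changed.
        """
        if self.reference is None:
            # The very first frame is always new data.
            self.reference = list(frame)
            return True
        if mean_abs_difference(self.reference, frame) >= self.threshold:
            self.reference = list(frame)
            return True
        # No event: the caller may reduce processing of incoming data.
        return False


detector = SceneChangeDetector(threshold=10.0)
frames = [[0.0] * 4, [1.0] * 4, [2.0] * 4, [50.0] * 4, [51.0] * 4]
events = [detector.process(frame) for frame in frames]
```

    Because the reference is updated only on an event, slow drift accumulates against the old reference and eventually triggers a change, rather than being hidden by frame-to-frame comparison.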

    Optimizing pre-processing times for faster response
    Granted invention patent (status: in force)

    Publication number: US09262689B1

    Publication date: 2016-02-16

    Application number: US14133347

    Application date: 2013-12-18

    CPC classification number: G06K9/34 G06K9/325 G06K2209/01

    Abstract: Embodiments of the subject technology provide for determining a region of a first acquired image based at least on a viewing mode and a set of respective positions of graphical elements to decrease the pre-processing time and perceived latency for the first image. One or more regions of text in the first image are detected, and a set of regions of text that overlap with the region of the image is determined and pre-processed. The subject technology may then pre-process an entirety of a subsequent image (e.g., to pick up missing text from the region of the first image). Thus, additional OCR results may be provided to the user by using the subsequent image(s) and merging subsequent results with previous results from the first image.
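
    Illustrative sketch: selecting the text regions that overlap a priority region, then merging later full-image results, can be expressed with plain rectangle tests. The `Rect` type, the function names, and the merge-by-rectangle rule are assumptions made for illustration; the abstract does not specify how results are keyed or merged.

```python
from typing import Dict, List, NamedTuple


class Rect(NamedTuple):
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def overlaps(a: Rect, b: Rect) -> bool:
    """True when the two rectangles share any area."""
    return (
        a.left < b.right and b.left < a.right
        and a.top < b.bottom and b.top < a.bottom
    )


def regions_to_preprocess(text_regions: List[Rect], priority: Rect) -> List[Rect]:
    """For the first image, keep only text regions touching the priority region."""
    return [region for region in text_regions if overlaps(region, priority)]


def merge_results(
    previous: Dict[Rect, str], subsequent: Dict[Rect, str]
) -> Dict[Rect, str]:
    """Add results from a later, fully processed image to earlier results.

    Assumption: earlier results win when both images report the same region.
    """
    merged = dict(subsequent)
    merged.update(previous)
    return merged


# Hypothetical priority region, e.g. the area not covered by on-screen controls.
priority_region = Rect(0, 0, 100, 50)
detected = [Rect(10, 10, 60, 30), Rect(10, 70, 60, 90), Rect(90, 40, 120, 60)]
first_pass = regions_to_preprocess(detected, priority_region)
```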
